(54) Apparatus and method for presenting mixed virtual reality shared among operators

(57) There is disclosed a mixed reality presentation apparatus which generates and displays a
three-dimensional virtual image on a see-through display device so as to allow a plurality of players
to play a multi-player game in a mixed reality environment. The apparatus has a CCD camera (230) for
detecting the mallet positions of the plurality of players, and a sensor (220) for detecting the view
point position of each player in the environment of the multi-player game. The apparatus generates a
three-dimensional virtual image that represents a game result of the multi-player game that has
progressed in accordance with changes in mallet position detected by the CCD camera (230) and is
viewed from the view point position of each player detected by the sensor (220), and outputs the
generated image to the corresponding see-through display device. The apparatus determines the motion
of each player by detecting infrared rays output from the corresponding mallet on the basis of an
image captured by the CCD camera (230). The view point position detected by the sensor (220) is
corrected by specifying the marker in an image obtained by a camera (240) attached to the head of
each player, and comparing the marker position in that image with an actual marker position.
Description

BACKGROUND OF THE INVENTION 

[0001] The present invention relates to a mixed reality presentation apparatus for presenting to a user or operator
mixed reality which couples a virtual image generated by computer graphics to the real space. The present invention
also relates to an improvement of precise detection of, e.g., head position and/or posture of an operator to whom mixed
reality is presented.

[0002] In recent years, extensive studies have been made about mixed reality (to be abbreviated as "MR" hereinafter)
directed to seamless coupling of a real space and virtual space. MR has earned widespread appeal as a technique for
enhancing virtual reality (to be abbreviated as "VR" hereinafter) for the purpose of coexistence of the real space and
the VR world that can be experienced in only a situation isolated from the real space.

[0003] Applications of MR are expected in new fields qualitatively different from VR used so far, such as a medical
assistant use for presenting the state of the patient's body to a doctor as if it were seen through, a work assistant use
for displaying the assembling steps of a product on actual parts in a factory, and the like.

[0004] These applications commonly require a technique of removing "deviations" between a real space and virtual
space. The "deviations" can be classified into a positional deviation, time deviation, and qualitative deviation. Many
attempts have been made to remove the positional deviation (i.e., alignment) as the most fundamental requirement
among the above deviations.

[0005] In case of video-see-through type MR that superposes a virtual object on an image sensed by a video camera,
the alignment problem reduces to accurate determination of the three-dimensional position of that video camera.
[0006] The alignment problem in case of optical-see-through type MR using a transparent HMD (Head Mount Display)
amounts to determination of the three-dimensional position of the user's view point. As a method of measuring such
position, a three-dimensional position-azimuth sensor such as a magnetic sensor, ultrasonic wave sensor, gyro, or the
like is normally used. However, the precision of such sensors is not sufficient, and their errors produce positional
deviations.

[0007] On the other hand, in the video-see-through system, a method of direct alignment on an image on the basis
of image information without using such sensors may be used. With this method, since positional deviation can be
directly processed, alignment can be precisely attained. However, this method suffers other problems, i.e.,
non-real-time processing, and poor reliability.

[0008] In recent years, attempts to realize precise alignment by using both a position-azimuth sensor and image
information, which compensate for each other's shortcomings, have been reported.

[0009] As one attempt, "Dynamic Registration Correction in Video-Based-Augmented Reality Systems" (Bajura
Michael and Ulrich Neumann, IEEE Computer Graphics and Applications 15, 5, pp. 52 - 60, 1995) (to be referred to as a
first reference hereinafter) has proposed a method of correcting a positional deviation arising from magnetic sensor
errors using image information in video-see-through MR.

[0010] Also, "Superior Augmented Reality Registration by Integrating Landmark Tracking and Magnetic Tracking"
(State Andrei et al., Proc. of SIGGRAPH 96, pp. 429 - 438, 1996) (to be referred to as a second reference hereinafter)
has proposed a method which further develops the above method, and compensates for ambiguity of position
estimation based on image information. The second reference sets a landmark, the three-dimensional position of which is
known, in a real space so as to remove any position deviation on an image caused by sensor errors when a
video-see-through MR presentation system is built using only a position-azimuth sensor. This landmark serves as a
yardstick for detecting the positional deviation from image information.

[0011] If the output from the position-azimuth sensor does not include any errors, a coordinate point (denoted as Q_I)
of the landmark actually observed on the image must agree with a predicted observation coordinate point (denoted as
P_I) of the landmark, which is calculated from the camera position obtained based on the sensor output, and the
three-dimensional position of the landmark.

[0012] However, in practice, since the camera position obtained based on the sensor output is not accurate, Q_I and
P_I do not agree with each other. The deviation between the predicted observation coordinate P_I and the observed
landmark coordinate Q_I represents the positional deviation between the landmark positions in the virtual and real
spaces and, hence, the direction and magnitude of the deviation can be calculated by extracting the landmark position
from the image.
[0013] In this way, by quantitatively measuring the positional deviation on the image, the camera position can be
corrected to remove the positional deviation.
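
The comparison of the predicted point P_I with the observed point Q_I can be sketched as follows. This is only an illustration under stated assumptions: an ideal pinhole camera with a focal length of 1, landmark positions already expressed in camera coordinates, and hypothetical function names (`project`, `deviation`) and sample values that do not come from the references.

```python
import math

def project(point_cam, focal_length=1.0):
    """Project a 3-D point in camera coordinates onto the image plane of an ideal pinhole camera."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (focal_length * x / z, focal_length * y / z)

def deviation(observed, predicted):
    """Return (dx, dy, magnitude) of the offset from the predicted point P_I to the observed point Q_I."""
    dx = observed[0] - predicted[0]
    dy = observed[1] - predicted[1]
    return dx, dy, math.hypot(dx, dy)

# Hypothetical values: where the sensor-derived pose places the landmark,
# and where the landmark really is, both in camera coordinates.
p_i = project((0.10, 0.00, 2.0))  # predicted observation P_I
q_i = project((0.16, 0.08, 2.0))  # actual observation Q_I
dx, dy, magnitude = deviation(q_i, p_i)
```

The pair (dx, dy) gives the direction of the deviation on the image and `magnitude` its size, which is the information the correction methods described next rely on.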

[0014] The simplest alignment method using both position-azimuth sensor and image is correction of sensor errors
using one point of landmark, and the first reference proposed a method of translating or rotating the camera position in
accordance with the positional deviation of the landmark on the image.

[0015] Fig. 1 shows the basic concept of positional deviation correction using one point of landmark. In the following
description, assume that the internal parameters of a camera are known, and an image is sensed by an ideal image
sensing system free from any influences of distortion and the like.

[0016] Let C be the view point position of the camera, Q_I be the observation coordinate position of a landmark on an
image, and Q_c be the landmark position in a real space. Then, the point Q_I is present on a line l_Q that connects the
points C and Q_c. On the other hand, from the camera position given by the position-azimuth sensor, a landmark
position P_c on the camera coordinate system, and its observation coordinate position P_I on the image can be estimated. In
the following description, v_1 and v_2 respectively represent three-dimensional vectors from the point C to the points Q_I
and P_I. In this method, positional deviation is corrected by modifying relative positional information between the camera
and object so that a corrected predicted observation coordinate position P'_I of the landmark agrees with Q_I (i.e., a
corrected predicted landmark position P'_c on the camera coordinate system is present on the line l_Q).

[0017] A case will be examined below wherein the positional deviation of the landmark is corrected by rotating the
camera position. This correction can be realized by modifying the position information of the camera so that the camera
rotates an angle θ that the two vectors v_1 and v_2 make with each other. In actual calculations, vectors v_1n and v_2n
obtained by normalizing the above vectors v_1 and v_2 are used, their outer product v_1n × v_2n is used as the rotation axis,
the angle given by their inner product v_1n · v_2n (i.e., its arccosine) is used as the rotation angle, and the camera is
rotated about the point C.
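
A minimal sketch of this rotation correction is given below, assuming the two vectors are available as plain 3-tuples in camera coordinates. The helper names and the use of Rodrigues' rotation formula are illustrative choices, not taken from the first reference.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rotation_correction(v1, v2):
    """Return (unit axis, angle in radians) from the normalized vectors v_1n and v_2n."""
    v1n, v2n = normalize(v1), normalize(v2)
    axis = cross(v1n, v2n)                 # outer product gives the rotation axis
    if math.sqrt(dot(axis, axis)) < 1e-12:
        return (0.0, 0.0, 1.0), 0.0        # parallel vectors: no rotation (anti-parallel not handled here)
    cos_angle = max(-1.0, min(1.0, dot(v1n, v2n)))  # clamp against rounding error
    return normalize(axis), math.acos(cos_angle)    # inner product gives the angle

def rotate(v, axis, angle):
    """Rodrigues' formula: rotate v about a unit axis by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return tuple(v[i] * c + k_cross_v[i] * s + axis[i] * k_dot_v * (1 - c) for i in range(3))

# Hypothetical vectors from C to the observed point Q_I (v1) and the predicted point P_I (v2).
v1 = (0.0, 0.0, 1.0)
v2 = (1.0, 0.0, 1.0)
axis, angle = rotation_correction(v1, v2)
rotated = rotate(normalize(v1), axis, angle)  # coincides with the direction of v2
```

Rotating about v_1n × v_2n by θ carries the direction of v_1 onto that of v_2. When the camera itself is rotated this way, scene points seen from the camera undergo the inverse rotation, so the predicted direction v_2 moves onto the observed direction v_1.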

[0018] A case will be examined below wherein the positional deviation of the landmark is corrected by relatively
translating the camera position. This correction can be realized by translating the object position in the virtual world by
v = n(v_1 - v_2). Note that n is a scale factor defined by:

    n = |CP_c| / |CP_I|    (1)

Note that |AB| is a symbol representing the distance between points A and B. Likewise, correction can be attained by
modifying the position information of the camera so that the camera translates by -v. This is because this manipulation
is equivalent to relative movement of a virtual object by v.
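
The translation correction can be illustrated with a short sketch. It assumes that the scale factor n is the ratio of the distance from C to the predicted landmark position P_c to the distance from C to its image point P_I, which is what similar triangles suggest for carrying an offset measured on the image plane out to the depth of the landmark. The function names and the numbers are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def translation_correction(c, q_i, p_i, p_c):
    """Return v = n (v_1 - v_2), with n assumed to be |CP_c| / |CP_I|."""
    v1 = sub(q_i, c)                      # C -> observed point Q_I on the image plane
    v2 = sub(p_i, c)                      # C -> predicted point P_I on the image plane
    n = norm(sub(p_c, c)) / norm(sub(p_i, c))
    return tuple(n * (a - b) for a, b in zip(v1, v2))

# Hypothetical geometry: camera at the origin, image plane at z = 1.
c = (0.0, 0.0, 0.0)
p_c = (0.2, 0.0, 2.0)     # predicted landmark position in camera coordinates
p_i = (0.1, 0.0, 1.0)     # its predicted image point
q_i = (0.15, 0.0, 1.0)    # observed image point
v = translation_correction(c, q_i, p_i, p_c)
p_c_corrected = tuple(a + b for a, b in zip(p_c, v))
```

With these values the corrected landmark position lies on the line through C and Q_I, so its image point coincides with the observation. Translating the camera by -v instead has the same relative effect.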

[0019] The above-mentioned two methods two-dimensionally adjust the positional deviation on the landmark but
cannot correct the camera position to a three-dimensionally correct position. However, when sensor errors are small, these
methods can expect sufficient effects, and the calculation cost required for correction is very small. Hence, these
methods are excellent in real-time processing.

[0020] However, the above references do not consider any collaborative operations of a plurality of operators, and can
only provide a mixed reality presentation system by a sole operator.

[0021] Since the methods described in the references need to detect a coordinate of the only landmark within the
sensed image, and thus have the limitation that a specific marker as a mark for alignment must always be sensed by the
camera, they allow observation within only a limited range.
[0022] The above limitation derived from using the single landmark is fatal to construction of mixed reality space
shared by a plurality of users or operators.

SUMMARY OF THE INVENTION 

[0023] The present invention has been made in consideration of the conventional problems, and has as its object to
provide an apparatus that presents a collaborative operation of a plurality of operators by mixed reality.
[0024] In order to achieve the above object, according to the present invention, a mixed reality presentation apparatus
which generates a three-dimensional virtual image associated with a collaborative operation to be done by a plurality
of operators (2000, 3000) in a predetermined mixed reality environment, and displays the generated virtual image on
see-through display devices (210L, 210R) respectively attached to the plurality of operators, comprises:

first sensor means (5010, 230) for detecting a position of each of actuators (260L, 260R) which are operated by the
plurality of operators and move as the collaborative operation progresses;

second sensor means (220, 5000, 5040) for detecting a view point position of each of the plurality of operators in
an environment of the collaborative operation; and

generation means (5030, 5050R, 5050L) for generating three-dimensional images for the see-through display
devices of the individual operators, the generation means generating a three-dimensional virtual image
representing an operation result of the collaborative operation that has progressed according to a change in position of each
of the plurality of actuators detected by the first sensor means when viewed from the view point position of each
operator detected by the second sensor means, and outputting the generated three-dimensional virtual image to
each see-through display device.

EP 0 899 690 A2

[0025] Since the first sensor means of the present invention detects the positions of the individual actuators operated
by the operators, the positional relationship between the actuators of the operators can be systematically recognized,
and mixed reality based on their collaborative operation can be presented without any positional deviation.
[0026] In order to track the collaborative operation by all the operators, a camera which covers substantially all the
operators within its field of view is preferably used. Hence, according to a preferred aspect of the present invention, the
first sensor means comprises:

an image sensing camera (230) which includes a maximum range of the actuator within a field of view thereof, the
position of the actuator moving upon operation of the operator; and

image processing means for detecting the position of the actuator by image processing from an image obtained by
the camera.

[0027] In order to present mixed reality based on the collaborative operation, detection of some operations of the
operators suffices. For this reason, according to a preferred aspect of the present invention, when the first sensor
means uses a camera, the actuator outputs light having a predetermined wavelength, and the first sensor means
comprises a camera which is sensitive to the light having the predetermined wavelength.
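
One simple way to locate such a light-emitting actuator in a camera image is to threshold the image and take the centroid of the bright pixels. The sketch below assumes a grayscale frame given as a list of rows and a hypothetical threshold; it illustrates the idea only and is not the image processing specified by the invention.

```python
def bright_centroid(image, threshold):
    """Return the (x, y) centroid of pixels at or above threshold, or None if there are none."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value >= threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)

# Hypothetical 4 x 4 frame from a camera sensitive to the actuator's wavelength:
# only the light source appears bright.
frame = [
    [0,   0,   0, 0],
    [0, 250, 240, 0],
    [0, 245, 255, 0],
    [0,   0,   0, 0],
]
mallet_xy = bright_centroid(frame, threshold=200)
```

Because the camera responds mainly to the predetermined wavelength, the rest of the scene stays dark and a fixed threshold is enough to isolate the actuator in this sketch.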

[0028] According to a preferred aspect of the present invention, the actuator is a mallet operated by a hand of the
operator. The mallet can be easily applied to a mixed reality environment such as a game.

[0029] According to a preferred aspect of the present invention, the see-through display device comprises an optical
transmission type display device.
[0030] According to a preferred aspect of the present invention, the second sensor means detects a head position
and posture of each operator, and calculates the view point position in accordance with the detected head position and
posture.
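
Deriving a view point from a head position and posture can be sketched as below. The sketch assumes that the posture is available as a 3 x 3 rotation matrix (sensor frame to environment frame) and that the eye sits at a fixed, pre-measured offset from the head sensor. Both the representation and the names are assumptions made for illustration.

```python
def view_point(head_position, head_rotation, eye_offset):
    """Return head_position + head_rotation * eye_offset (all in environment coordinates)."""
    rotated = tuple(
        sum(head_rotation[i][j] * eye_offset[j] for j in range(3))
        for i in range(3)
    )
    return tuple(head_position[i] + rotated[i] for i in range(3))

# Hypothetical values: head turned 90 degrees about the vertical (z) axis,
# eye located 0.1 units along the sensor's x axis.
head_position = (1.0, 2.0, 3.0)
head_rotation = [
    [0.0, -1.0, 0.0],
    [1.0,  0.0, 0.0],
    [0.0,  0.0, 1.0],
]
eye_offset = (0.1, 0.0, 0.0)
eye = view_point(head_position, head_rotation, eye_offset)
```

The offset is rotated by the detected posture before being added, so the computed view point follows the head as it turns.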

[0031] In order to detect the three-dimensional posture of the head of each operator, a magnetic sensor is preferably
used. Therefore, according to a preferred aspect of the present invention, the second sensor means comprises a
transmitter (250) for generating an AC magnetic field, and a magnetic sensor (220L, 220R) attached to the head portion of
each operator. With this arrangement, the three-dimensional posture of the head of each operator can be detected in a
non-contact manner.

[0032] According to a preferred aspect of the present invention, the generation means comprises:

storage means (S10 - S52) for storing a rule of the collaborative operation;

means (5050L, 5050R) for generating a virtual image representing a progress result of the collaborative operation
in accordance with the rule stored in the storage means in correspondence with detected changes in position of the
plurality of actuators; and

means (5050L, 5050R, S440) for generating a three-dimensional virtual image for each view point position by
transferring a coordinate position for each view point position of each operator detected by the second sensor
means.

[0033] Similarly, in order to achieve the above object, according to the present invention, a mixed reality presentation
apparatus which generates a three-dimensional virtual image associated with a collaborative operation to be done by
a plurality of operators in a predetermined mixed reality environment and displays the generated virtual image on
see-through display devices (210L, 210R) respectively attached to the plurality of operators, comprises:

a camera (230) which includes a plurality of actuators operated by the plurality of operators in the collaborative
operation within a field of view thereof;

actuator position detection means (230, 5010) for outputting information associated with positions of the actuators
on a coordinate system of that environment on the basis of an image sensed by the camera;

sensor means (220L, 220R, 5000) for detecting and outputting a view point position of each of the plurality of
operators in the environment of the collaborative operation; and

image generation means (5030, 5050R, 5050L) for outputting a three-dimensional virtual image of a progress
result viewed from the view point position of each operator detected by the sensor means to each see-through
display device so as to present the progress result of the collaborative operation that has progressed according to
detected changes in position of the actuator to each operator.

[0034] The above object is also achieved by a mixed reality presentation apparatus which generates a
three-dimensional virtual image associated with a collaborative operation to be done by a plurality of operators in a predetermined
mixed reality environment and displays the generated virtual image on see-through display devices (210L, 210R)
respectively attached to the plurality of operators. This apparatus comprises:

a camera (230) which includes a plurality of actuators operated by the plurality of operators in the collaborative
operation within a field of view thereof;

actuator position detection means (230, 5010) for outputting information associated with positions of the actuators
on a coordinate system of that environment on the basis of an image sensed by the camera;

sensor means (220L, 220R, 5000) for detecting and outputting a view point position of each of the plurality of
operators in the environment of the collaborative operation; and

image generation means (5030, 5050R, 5050L) for outputting a three-dimensional virtual image of a progress
result viewed from the view point position of each operator detected by the sensor means to each see-through
display device so as to present the progress result of the collaborative operation that has progressed according to
detected changes in position of the actuator to each operator.

[0035] The above object is also achieved by a mixed reality presentation apparatus which generates a
three-dimensional virtual image associated with a collaborative operation to be done by a plurality of operators in a predetermined
mixed reality environment, and displays the generated virtual image on see-through display devices (210L, 210R)
respectively attached to the plurality of operators. This apparatus comprises:

a first camera (230) which substantially includes the plurality of operators within a field of view thereof;

a first processor (5010) for calculating operation positions of the plurality of operators on the basis of an image
obtained by the first camera;

a detection device (5000) for detecting a view point position of each operator using a plurality of sensors (220L,
220R) attached to the plurality of operators;

a plurality of second cameras (240L, 240R) for sensing front fields of the individual operators, at least one second
camera being attached to each of the plurality of operators;

a second processor (5060L, 5060R) for calculating information associated with a line of sight of each operator on
the basis of each of images from the plurality of second cameras;

a third processor (5040L, 5040R) for correcting the view point position of each operator detected by the sensor
using the line of sight information from the second processor and outputting the corrected view point position as a
position on a coordinate system of the mixed reality environment;

a first image processing device (5030, 5050L, 5050R) for making the collaborative operation virtually progress on
the basis of the operation position of each operator calculated by the first processor, and generating
three-dimensional virtual images representing results that have changed along with the progress of the collaborative operation
for the plurality of operators; and

a second image processing device (5040L, 5040R, 5050R, 5050L) for transferring coordinate positions of the
three-dimensional virtual images for the individual operators generated by the first image processing device in
accordance with the individual corrected view point positions calculated by the third processor, and outputting the
coordinate-transferred images to the see-through display devices.

[0036] The above object is also achieved by a method of generating a three-dimensional virtual image associated with
a collaborative operation to be done within a predetermined mixed reality environment so as to display the image on
see-through display devices attached to a plurality of operators in the mixed reality environment. This method
comprises:

the image sensing step of sensing a plurality of actuators operated by the plurality of operators by a camera that
includes the plurality of operators within a field of view thereof;

the actuator position acquisition step of calculating information associated with positions of the actuators on a
coordinate system of the environment on the basis of the image sensed by the camera;

the view point position detection step of detecting a view point position of each of the plurality of operators in the
environment of the collaborative operation on the coordinate system of the environment;

the progress step of making the collaborative operation virtually progress in accordance with changes in position
of the plurality of actuators calculated in the actuator position acquisition step; and

the image generation step of outputting a three-dimensional virtual image of a progress result in the progress step
viewed from the view point position of each operator detected in the view point position detection step to each
see-through display device so as to present the progress result in the progress step to each operator.

[0037] The above object is also achieved by a mixed reality presentation method for generating a three-dimensional
virtual image associated with a collaborative operation to be done by a plurality of operators (2000, 3000) in a
predetermined mixed reality environment, and displaying the generated virtual image on see-through display devices (210L,
210R) respectively attached to the plurality of operators. This method comprises:

the first image sensing step of capturing an image using a first camera (230) which substantially includes the
plurality of operators within a field of view thereof;

the first detection step of detecting operation positions of the plurality of operators on the basis of the image sensed
by the first camera;

the second detection step of detecting a view point position of each operator using a plurality of sensors (220L,
220R) respectively attached to the plurality of operators;

the second image sensing step of sensing a front field of each operator using each of a plurality of second cameras
(240L, 240R), at least one second camera being attached to each of the plurality of operators;

the line of sight calculation step of calculating information associated with a line of sight of each operator on the
basis of each of images obtained from the plurality of second cameras;

the correction step of correcting the view point position of each operator detected by the sensor on the basis of the
line of sight information calculated in the line of sight calculation step, and obtaining the corrected view point
position as a position on a coordinate system of the mixed reality environment;

the generation step of making the collaborative operation virtually progress on the basis of the operation positions
of the individual operators detected in the first detection step, and generating three-dimensional virtual images that
represent results of the collaborative operation and are viewed from the view point positions of the plurality of
operators; and

the step of transferring coordinate positions of the three-dimensional virtual images for the individual operators
generated in the generation step in accordance with the individual corrected view point positions obtained in the
correction step, and outputting the coordinate-transferred images to the see-through display devices.

[0038] It is another object of the present invention to provide a position/posture detection apparatus and method,
which can precisely capture an operator who moves across a broad range, and a mixed reality presentation apparatus
based on the detected position and posture.
[0039] In order to achieve the above object, the present invention provides a position/posture detection apparatus for
detecting an operation position of an operator so as to generate a three-dimensional virtual image that represents an
operation done by the operator in a predetermined mixed reality environment, comprising:

a position/posture sensor (220) for measuring a three-dimensional position and posture of the operator to output
an operator's position and posture signal;

a camera sensing images of a first plurality of markers arranged at known positions in the environment;

detection means for processing an image signal from said camera, tracking a marker of the first plurality of markers,
and detecting a coordinate value of the tracked marker in a coordinate system; and

calculation means for calculating a portion-position and -posture representing a position and posture of the
operating portion, on the basis of the coordinate value of the tracked marker detected by said detection means and the
operator's position and posture signal outputted from the position/posture sensor.

[0040] In order to achieve the above object, the present invention provides a position/posture detection method for
detecting an operation position of an operator so as to generate a three-dimensional virtual image associated with an
operation to be done by the operator in a predetermined mixed reality environment, comprising:

the step of measuring to output an operator position/posture signal indicative of a three-dimensional position and
posture of the operator;

the step of processing an image signal from a camera which captures a plurality of markers arranged in the
environment, tracking at least one marker and detecting a coordinate of said at least one marker; and

outputting a head position/posture signal indicative of a position and posture of the head of the operator, on the
basis of the coordinate of the tracked marker and the measured operator position/posture signal.

[0041] In order to achieve the above object, the present invention provides a position/posture detection apparatus for
detecting an operation position of an operator, comprising:

a position/posture sensor for measuring a three-dimensional position and posture of the operator to output an
operator's position and posture signal;

a camera sensing images of a first plurality of markers arranged at known positions in the environment;

detection means for processing an image signal from said camera, tracking a marker of the first plurality of markers,
and detecting a coordinate value of the tracked marker in a coordinate system; and

correction means for correcting an output signal from the sensor on the basis of the coordinate value of the tracked
marker.
[0042] In order to achieve the above object, the present invention provides a mixed reality presentation apparatus
comprising:

a work table having a first plurality of markers arranged at known positions;

a position/posture sensor attached to an operator to detect a head posture of the operator;

a camera being set to capture at least one of the first plurality of markers within a field of view of the camera;

a detection means for processing an image signal from the camera, tracking a marker from among the first plurality
of markers, and detecting a coordinate value of a tracked marker;

calculation means for calculating a position/posture signal representing a position and posture of the operator's
viewpoint, on the basis of the coordinate value of the tracked marker detected by said detection means and an
operator's head position/posture signal outputted from the position/posture sensor; and

generation means for generating a virtual image for presenting a mixed reality at the view point in accordance with
the calculated position/posture signal.

[0043] The detection apparatus and method according to the invention as set forth can correct or detect a position
and posture of the operator precisely even when the operator moves within a wide range environment since at least
one marker is assured to be captured in the image by the camera.

[0044] According to a preferred aspect of the invention, the markers are arranged so that a distance between one
marker and another marker of the plurality of markers in a direction crossing in front of the operator is set to be larger
as the markers are farther from the operator. This prevents deterioration of precision in identifying a marker.
[0045] According to a preferred aspect of the invention, the markers are arranged so that a layout distribution density
of the plurality of markers in the environment is set so that a density distribution of markers farther from the operator is
set to be lower than a density distribution of markers closer to the operator. This also prevents deterioration of
precision in identifying a marker.
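
A layout that follows these two aspects can be sketched as follows. The sketch assumes the operator stands at the origin of a planar table and looks along +y, and it makes the lateral spacing proportional to the distance from the operator. The proportional rule, the function name and the parameter values are illustrative assumptions, not the layout prescribed by the invention.

```python
def marker_rows(distances, base_spacing, base_distance, half_width):
    """Return one list of (x, y) marker positions per row, with spacing growing with distance."""
    rows = []
    for d in distances:
        spacing = base_spacing * d / base_distance   # wider spacing for farther rows
        count = int(half_width // spacing)           # markers on each side of the centre line
        rows.append([(i * spacing, d) for i in range(-count, count + 1)])
    return rows

# Hypothetical table: rows at 1.0 and 2.0 units from the operator, 1.0 unit to either side.
rows = marker_rows(distances=[1.0, 2.0], base_spacing=0.25, base_distance=1.0, half_width=1.0)
near_row, far_row = rows
```

Here the near row holds nine markers spaced 0.25 apart and the far row five markers spaced 0.5 apart, so far markers, which appear compressed in the operator's camera image, remain separable.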

[0046] According to a preferred aspect of the invention, where a plurality of operators perform a collaborative
operation, markers for one operator are of the same representation manner. The markers for one operator have the same
color, for example. This makes it easy to discriminate the markers of one operator from those of the other operators.
[0047] According to a preferred aspect of the invention, the portion is a view point position of the operator.
[0048] According to a preferred aspect of the invention, said detection means uses a marker firstly found within an
image obtained by said camera. It is not necessary to keep tracking one marker in the invention. It is enough for any one
marker to be found. Using a first found marker facilitates searching for or tracking a marker.

[0049] According to a preferred aspect of the invention, the detection means searches an image of a present scene
for a marker found in an image of a previous scene. This assures continuity in the tracking.
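
The two search strategies in paragraphs [0048] and [0049] can be combined in a short sketch: look first for the marker nearest to where it was in the previous scene, and otherwise fall back to the first marker found. The function name, the `max_jump` parameter and the list-of-points representation of detections are hypothetical.

```python
import math

def track(previous_xy, detections, max_jump):
    """Pick the marker to use in the present image, or None if no marker was detected."""
    if not detections:
        return None
    if previous_xy is not None:
        nearest = min(detections, key=lambda p: math.dist(p, previous_xy))
        if math.dist(nearest, previous_xy) <= max_jump:
            return nearest            # same marker as in the previous scene
    return detections[0]              # otherwise use the first marker found

# Hypothetical image coordinates of markers detected in the present scene.
present = [(40.0, 12.0), (101.0, 52.0), (160.0, 90.0)]
continued = track((100.0, 50.0), present, max_jump=10.0)
restarted = track((300.0, 300.0), present, max_jump=10.0)
first_time = track(None, present, max_jump=10.0)
```

Continuity is kept while the previously tracked marker stays near its last position. When it leaves the image, any other visible marker can take over.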

[0050] The sensor may be mounted anywhere on the operator. According to a preferred aspect of the invention, the
sensor is mounted on the head of the operator. The sensor is then close to the view point of the operator. This facilitates
application to an HMD.

[0051] According to a preferred aspect of the invention, the first plurality of markers are arranged within the
environment so that at least one marker is captured within the field of image of the camera.

[0052] Detection of the tracked marker can be made in various coordinate systems. According to a preferred aspect of
the invention, said detection means calculates a coordinate of the tracked marker in an image coordinate system.
According to a preferred aspect of the invention, said detection means calculates a coordinate of the tracked marker in
a camera coordinate system.

[0053] According to a preferred aspect of the invention, the first plurality of markers are depicted on a planar table
arranged within the environment. This is suitable for a case where the collaborative operation is made on the table.
[0054] According to a preferred aspect of the invention, said first plurality of markers are arranged in a
three-dimensional manner. This aspect is suitable for a case where markers must be arranged in a three-dimensional manner.
[0055] According to a preferred aspect of the invention, the detection means comprises identifying means for
identifying a marker to be tracked from among said first plurality of markers.

[0056] Similarly, according to a preferred aspect of the invention, the detection means comprises means for selecting, where said detection means detects a second plurality of markers within an image captured by said camera, one marker to be tracked from among said second plurality of markers.

[0057] According to a preferred aspect of the invention, the identifying means identifies a marker selected by the selection means in terms of an image coordinate system.

[0058] According to a further aspect of the invention, the identifying means comprises:

means for detecting a signal representing a position/posture of the camera;

means for converting three-dimensional coordinates of said first plurality of markers in the world coordinate system into a coordinate value in terms of the image coordinate system, in accordance with the signal representing the position/posture of the camera; and

means for identifying a marker to be tracked by comparing the coordinates of the first plurality of markers in the image coordinate system and an image coordinate value of the tracked marker.

[0059] According to another aspect of the invention, the identifying means identifies a marker selected by the selection means in terms of a world coordinate system. According to a yet further aspect of the invention, the identifying means comprises:

means for detecting a signal representing a position/posture of the camera;

means for converting a coordinate of the tracked marker in terms of a camera coordinate system into a coordinate value in terms of the world coordinate system; and

selection means for selecting said at least one marker to be tracked by comparing coordinates of the second plurality of markers and coordinates of the first plurality of markers, in terms of the world coordinate system.

[0060] Where an image coordinate system is used, according to a yet further aspect of the invention, the operation portion includes a view position of the operator, and said calculation means obtains a position/posture signal at a view point of the operator on the basis of:

said operator position/posture signal, and

a distance difference between an image coordinate value of the tracked marker and a coordinate value of the tracked marker which is converted from a three-dimensional coordinate of the marker in the world coordinate system.

[0061] Where a world coordinate system is used, according to a yet further aspect of the invention, the operation portion includes a view position of the operator, and said calculation means obtains a position/posture signal at a view point of the operator on the basis of:

said operator position/posture signal, and

a distance difference between a coordinate value of the tracked marker which is converted from the camera coordinate system into the world coordinate system and a three-dimensional coordinate of the marker in the world coordinate system and a coordinate value of the tracked marker.

[0062] The camera may comprise plural camera units. This allows a coordinate of a tracked marker to be detected in a camera coordinate system, so that error in the position/posture sensor is corrected in a three-dimensional manner. Further, since the tracked marker is identified in the world coordinate system, the multiple cameras can cope with markers arranged three-dimensionally. Furthermore, precision in identifying a tracked marker is improved compared with that in the image coordinate system.

[0063] Other features and advantages of the present invention will be apparent from the following description taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the figures thereof.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0064]

Fig. 1 is a view for explaining the principle of camera position correction, which is applied to the prior art and an embodiment of the present invention;

Fig. 2 is a side view showing the arrangement of a game apparatus used in the first embodiment of the present invention;

Fig. 3 is a view for explaining a scene that can be seen within the field of view of the left player in the game apparatus shown in Fig. 2;

Fig. 4 is a view for explaining the arrangement of an HMD used in the game apparatus shown in Fig. 2;

Fig. 5 is a view for explaining the layout of markers set on a table of the game apparatus shown in Fig. 2;

Fig. 6 is a view for explaining transition of markers included in an image captured by a camera attached to the head of the player along with the movement of the player on the table shown in Fig. 5;

Fig. 7 is a block diagram for explaining the arrangement of a three-dimensional image generation apparatus for the game apparatus of the first embodiment;

Fig. 8 is a flow chart for explaining the processing sequence by a mallet position measurement unit of the first embodiment;

Fig. 9 is a flow chart for explaining a subroutine (local search) of the processing sequence by the mallet position measurement unit of the first embodiment;

Fig. 10 is a flow chart for explaining a subroutine (global search) of the processing sequence by the mallet position measurement unit of the first embodiment;

Fig. 11 is a view for explaining segmentation of the regions to be processed used in the processing of the flow chart shown in Fig. 8;

Fig. 12 is a view showing the method of setting the regions to be processed used in the processing of the flow chart shown in Fig. 8;

Fig. 13 is a view for explaining a virtual game field in the game of the first embodiment;

Fig. 14 is a flow chart for explaining the control sequence of game management in a game status management unit of the first embodiment;

Fig. 15 is a view for explaining a method of detecting a mallet;

Fig. 16 is a flow chart for explaining the overall processing sequence of a correction processing unit in the first embodiment;

Fig. 17 is a flow chart for explaining some steps (marker tracking) in the flow chart shown in Fig. 16 in detail;

Fig. 18 is a flow chart for explaining some steps (marker position prediction) in the flow chart shown in Fig. 16 in detail;

Fig. 19 is a view for explaining the principle of detection of a reference marker used in correction;

Fig. 20 is a flow chart for explaining the principle of detection of a reference marker;

Fig. 21 is a view showing the arrangement of an HMD used in the second embodiment;

Fig. 22 is a block diagram showing the arrangement of the principal part of an image processing system of the second embodiment;

Fig. 23 is a flow chart showing some control steps of the image processing system of the second embodiment;

Fig. 24 is a view for explaining transition of a reference marker used in a modification of the embodiment;

Fig. 25 is a view for explaining the principle of marker search used in a modification of the embodiment; and

Fig. 26 explains the principle of the correction process adopted in the first embodiment.

DETAILED DESCRIPTION OF THE INVENTION 

[0065] A system according to an embodiment in which a mixed reality presentation method and HMD of the present invention are applied to an air hockey game apparatus will be explained hereinafter.

[0066] An air hockey game is a competitive game requiring at least two players. The players exchange shots of a puck, which floats on compressed air supplied from below, and a player scores by shooting the puck into the goal of the other player. The player who outscores the opponent wins the game. In the air hockey game to which MR of this embodiment is applied, a virtual puck is presented to a player by superposing it as a virtual three-dimensional image on a table in a real environment, and the players virtually exchange shots of the virtual puck using real mallets.
[0067] The game apparatus is characterized by the following features:

1: Image-sensing, with a camera, a real space shared by a plurality of players, detecting and specifying actuators (mallets in the embodiments) manipulated by the operators, and presenting a mixed reality space shared by the players.

2: In order to precisely detect the view points of the players, who move within the wide real space, a camera as well as a magnetic sensor is attached to the head of each player. The camera senses at least one of the markers provided on a table used for the game, and the position and posture (that is, the view point) of the player's head detected by the sensor are corrected on the basis of a difference between the image coordinate and the actual position of the at least one marker.

<Arrangement of Game Apparatus>

[0068] Fig. 2 is a side view of the game apparatus portion of the system of this embodiment. In an MR air hockey game, two players 2000 and 3000 face each other while holding mallets (260L, 260R) in their hands. The two players 2000 and 3000 wear head mount displays (to be abbreviated as HMDs hereinafter) 210L and 210R on their heads. The mallet of this embodiment has an infrared ray generator at its distal end. As will be described later, in this embodiment, the mallet position is detected by image processing. If each mallet has a distinctive feature in its shape or color, the mallet position can also be detected by pattern recognition using that feature.

[0069] The HMD 210 of this embodiment is of see-through type, as shown in Fig. 4. The two players 2000 and 3000 can observe the surface of a table 1000 even when they wear the HMDs 210L and 210R. The HMD 210 receives a three-dimensional virtual image from an image processing system (to be described later). Hence, the players 2000 and 3000 observe a three-dimensional image displayed on the display screen of their HMDs 210 superposed on an image in the real space observed via optical systems (not shown in Fig. 2) of the HMDs 210.

[0070] Fig. 3 shows an image seen by the left player 2000 via his or her HMD 210L. The two players 2000 and 3000 exchange shots of a virtual puck 1500. The puck 1500 is hit by an actual mallet 260L (260R) held in the hand of the player 2000 (3000). The player 2000 holds the mallet 260L in the hand. The player 2000 can see a goal 1200R immediately in front of the opponent 3000. The image processing system (to be described later; not shown in Fig. 3) generates a three-dimensional CG so that the player 2000 can see the goal 1200R near the opponent, and displays it on the HMD 210L.

[0071] The opponent 3000 can also see a goal 1200L near the player 2000 via the HMD 210R.

[0072] The puck 1500 is also generated by the image processing system (to be described later), and is displayed on the HMDs of the two players.

<HMD with Magnetic Sensor>

[0073] Fig. 4 shows the arrangement of the HMD 210. This HMD 210 is obtained by attaching a magnetic sensor 220 to the main body of an HMD in, e.g., Japanese Laid-Open Patent No. 7-333551 via a column 221. In Fig. 4, reference numeral 211 denotes an LCD display panel. Light coming from the LCD display panel enters an optical member 212, and is reflected by a total reflection surface 214. Then, the light is reflected by a total reflection surface of a convex mirror 213, is transmitted through the total reflection surface 214, and then reaches the eyes of the observer.
[0074] The magnetic sensor 220 is a Fastrak magnetic sensor available from Polhemus Corp. Since the magnetic sensor is readily influenced by magnetic noise, it is separated from the display panel 211 and a camera 240, which are noise sources, by means of a pole 221 made of plastic.

[0075] Note that the arrangement obtained by attaching the magnetic sensor and/or camera to the HMD shown in Fig. 4 is not limited to an optical-see-through type HMD. Even in a video-see-through type HMD, the magnetic sensor and/or camera can be attached to the HMD for the purpose of accurate detection of the head position and posture.
[0076] In Fig. 2, each HMD 210 is fixed to the player's head by a band (not shown). The magnetic sensor 220 (Fig. 4) and a CCD camera 240 (240L, 240R; Fig. 2) are respectively fixed to the head of the player. The field of view of the camera 240 is set in the forward direction of the player. When an HMD comprising the magnetic sensor 220 and camera 240 is used in an air hockey game, since each player observes the upper surface of the table 1000, the camera 240 senses an image of the surface of the table 1000. The magnetic sensor 220 (220L, 220R) senses changes in the AC magnetic field generated by an AC magnetic field generation source 250.

[0077] As will be described later, images sensed by the camera 240 are utilized to correct the position and posture of the head detected by the magnetic sensor 220.

[0078] When the player looks obliquely downward to observe the surface of the table 1000, he or she can see the surface of the table 1000, the above-mentioned virtual puck 1500, the real mallet 260 (260L, 260R), and the virtual goal 1200 (1200L, 1200R) within the field of view via the HMD 210. When the player moves the head within a horizontal two-dimensional plane, or makes a tilting, yaw, or rolling motion, such changes are detected by the magnetic sensor 220, and are also observed as changes in the image sensed by the CCD camera 240 in accordance with changes in posture of the head. Specifically, the signal indicative of head position from the magnetic sensor 220 will be corrected by subjecting images from the camera to image processing, as will be described later.

<A Plurality of Markers>

[0079] The mallet 260 held by each player has an infrared ray generator at its distal end, and each mallet position in a two-dimensional plane on the table 1000 is detected by a CCD camera 230 that detects the infrared rays. Specifically, the camera 230 is provided so that it can detect the mallet positions of the players, and in this embodiment the detected positions are used for advancing the game.

[0080] On the other hand, the CCD camera 240 outputs an image called a marker image.

[0081] Fig. 5 shows an example of the layout of markers on the table 1000. In Fig. 5, five landmarks, i.e., markers (1600 to 1604) indicated by circular marks are used for helping detect the head position of the player 2000, and five landmarks, i.e., markers (1650 to 1654) indicated by square marks are used for helping detect the head position of the player 3000. When a plurality of markers are arranged as shown in Fig. 5, the marker seen by the player is determined by the player's head position and, especially, the posture. In other words, when the marker sensed by the CCD camera 240 attached to each player is specified and its position in the image is detected, the output signal from the magnetic sensor for detecting the head posture of the player can be corrected.

[0082] Note that the circular and square marks in Fig. 5 are used for the purpose of illustration; the markers are not characterized by their shape and may have any other arbitrary shapes.

[0083] The marker groups (1600 to 1604, 1650 to 1654) assigned to the two players (2000, 3000) have different colors. In this embodiment, the markers for the left player (#1 player) are red, and those for the right player (#2 player) are green. Such colors allow easy identification of the markers in the image processing.
[0084] Markers may instead be identified by their shape and/or texture rather than by their color.
[0085] The feature of this embodiment lies in the use of a plurality of markers. Since a plurality of markers are used, at least one marker always falls within the field of view of the CCD camera 240 as long as the player plays the game on the table 1000 within the operation range of the air hockey game.

[0086] Fig. 6 illustrates how the image processing ranges for detecting the markers move as the player variously moves the head. As shown in Fig. 6, one image includes at least one marker. In other words, the number of markers, the interval between adjacent markers, and the like should be set in correspondence with the size of the table 1000, the field angle of the camera 240, and the size of the moving range of each player based on the nature of the game. In the example of Fig. 5, since a broader range falls within the field of view as the markers are farther from the player, the interval between adjacent markers must be increased. This arrangement makes the distance between nearby markers in the image equal to that between farther markers, and keeps the number of markers contained within an image of a far area low. With such a setup, deterioration in the precision of marker detection is avoided. Thus, both the nearby and farther markers have substantially equal marker densities in the captured image, and too many markers can be prevented from being unwantedly sensed in one frame.

[0087] As will be described later, the game apparatus of the embodiment does not need many markers in each image; it is enough that at least one marker is sensed in the images by the camera. The apparatus does not have to keep tracking the same marker while the game progresses.

<MR Image Generation System>

[0088] Fig. 7 shows a three-dimensional image generation/presentation system in the game apparatus shown in Fig. 2. The image generation/presentation system outputs three-dimensional virtual images (the puck 1500 and goals 1200 in Fig. 3) to the display devices of the HMD 210L of the left player 2000 and the HMD 210R of the right player 3000. Right and left parallax images for three-dimensional virtual images are generated by image generation units 5050L and 5050R. In this embodiment, the image generation unit 5050 used a computer system "Onyx2" available from Silicon Graphics, Inc., U.S.A.

[0089] Each image generation unit 5050 receives puck position information generated by a game status management unit 5030, and information associated with the corrected view point position and head direction generated by two correction processing units 5040L and 5040R. The game status management unit 5030 and the correction processing units 5040L and 5040R used "Onyx2" computer systems.

[0090] The CCD camera 230 fixed above the center of the table 1000 can capture the entire surface of the table 1000 within its field of view. Mallet information acquired by the camera 230 is input to a mallet position measurement unit 5010. The measurement unit 5010 similarly used a computer system "O2" available from Silicon Graphics, Inc. The measurement unit 5010 detects the mallet positions of the two players, i.e., their hand positions. The information associated with the hand positions is input to the game status management unit 5030, which manages the game state. More specifically, the game state and progress of the game are basically determined by the mallet positions.

[0091] A position/posture detection unit 5000, comprising a computer system "O2" available from Silicon Graphics, Inc., detects the view point positions and head postures of the two players (that is, the position and posture of the sensor 220 itself) by receiving the outputs from the two magnetic sensors 220L and 220R, detects the view point position (X, Y, Z) and posture (p, r, φ) at the camera 240 mounted on each player, and then outputs them to the correction processing units 5040L and 5040R.

[0092] On the other hand, the CCD cameras 240L and 240R fixed to the heads of the players acquire marker images, which are respectively processed by marker position detection units 5060L and 5060R so as to detect positions of tracked markers falling within the respective fields of view of the individual cameras 240. The information associated with the marker position is input to the correction processing unit 5040 (5040L, 5040R).
[0093] Note that the marker position detection units 5060 (5060L, 5060R), which track respective markers in respective images sensed by the cameras, comprised "O2" computer systems.

<Mallet Position Measurement>

[0094] Figs. 8 to 10 are flow charts showing the control sequence for measuring the mallet position. Tracking the mallet positions of the players with the single camera 230 makes it possible to provide a mixed reality space shared by the players. The measurement of mallet positions according to the embodiment will be described with reference to Figs. 8 to 10.
[0095] In the air hockey game, each player never moves his or her own mallet to the region of the other player. For this reason, the processing for searching for the mallet 260L (260R) of the left player 2000 (right player 3000) need only be done for image data IL (image data IR) of the left field (right field), as shown in Fig. 11. It is easy to break up the image acquired by the fixed CCD camera 230 into two regions, as shown in Fig. 11.

[0096] Hence, in the flow chart shown in Fig. 8, the processing for searching for the mallet 260L of player #1 (player 2000) is done in step S100, and that for searching for the mallet 260R of player #2 (player 3000) is done in step S200.
[0097] The search for the mallet of the right player (step S200) will be exemplified below for the sake of simplicity.
[0098] In step S210, multi-valued image data of the surface of the table 1000 sensed by the TV camera 230 is acquired. In step S212, the right half image data IR of that multi-valued image data is passed to a subroutine "search in local region". Fig. 9 shows the "search in local region" processing in detail. If the mallet coordinate position (x, y) on the image coordinate system is found in step S212, the flow advances from step S214 to step S220, and the mallet coordinate position (x, y) on the image coordinate system is transferred into a coordinate position (x', y') on the coordinate system (see Fig. 13) of the table 1000 using:

$$\begin{bmatrix} hx' \\ hy' \\ h \end{bmatrix} = M_T \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

where the matrix $M_T$ is a known 3 × 3 transformation matrix that attains calibration between the image and table coordinate systems. The coordinate position (x', y') obtained in step S220 (in Fig. 3, the position (x', y') is indicated as the "hand position") is sent to the game status management unit 5030.
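The conversion in step S220 is a planar projective mapping: the image point is multiplied by the known 3 × 3 calibration matrix in homogeneous form, and the result is divided by its third component h. A minimal sketch in Python follows. The function name `image_to_table`, the parameter `m_t`, and the numeric values in `EXAMPLE_M_T` are illustrative assumptions, not calibration data from the embodiment.

```python
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def image_to_table(m_t: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Map an image pixel (x, y) to table coordinates (x', y').

    Computes [hx', hy', h]^T = m_t * [x, y, 1]^T and divides by h.
    """
    hx = m_t[0][0] * x + m_t[0][1] * y + m_t[0][2]
    hy = m_t[1][0] * x + m_t[1][1] * y + m_t[1][2]
    h = m_t[2][0] * x + m_t[2][1] * y + m_t[2][2]
    if h == 0.0:
        raise ValueError("point maps to infinity under this matrix")
    return hx / h, hy / h


# Hypothetical calibration: scale pixels by 0.5 and shift the origin by (-10, -20).
EXAMPLE_M_T = [
    [0.5, 0.0, -10.0],
    [0.0, 0.5, -20.0],
    [0.0, 0.0, 1.0],
]
```

With the hypothetical matrix above, `image_to_table(EXAMPLE_M_T, 100, 60)` yields the table position (40.0, 10.0). A real matrix would be obtained beforehand by calibrating the fixed camera 230 against the table.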

[0099] If the mallet cannot be found in the local region, a subroutine "search in global region" is executed in step S216. If the mallet is found in the subroutine "search in global region", the obtained coordinate position is transferred into that on the table coordinate system in step S220. Note that the coordinate position obtained from the local or global region is used in a search for the mallet in the local region for the next frame.
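The control flow of Fig. 8 for one player (local search, fallback to global search, then conversion to table coordinates) can be sketched as follows. This is a minimal illustration in Python rather than the implementation of the embodiment: the function name `find_mallet` and the three callables passed to it are hypothetical, and going straight to the global search when no previous position exists is an assumption.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[float, float]


def find_mallet(
    frame: object,
    prev_pos: Optional[Point],
    local_search: Callable[[object, Point], Optional[Point]],
    global_search: Callable[[object], Optional[Point]],
    to_table: Callable[[Point], Point],
) -> Tuple[Optional[Point], Optional[Point]]:
    """Return (table_position, image_position) for one frame, or (None, None)."""
    pos = None
    if prev_pos is not None:
        # Steps S212/S214: look near the position found in the previous frame.
        pos = local_search(frame, prev_pos)
    if pos is None:
        # Step S216: fall back to the coarse search over the whole field.
        pos = global_search(frame)
    if pos is None:
        return None, None
    # Step S220: convert to table coordinates; the image position is returned
    # as well so that the caller can seed the next frame's local search.
    return to_table(pos), pos
```

The caller would keep the returned image position and pass it as `prev_pos` for the next frame, which mirrors the note that the result of either search is reused by the next local search.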

[0100] Fig. 9 shows the processing for searching for the mallet in the local region (i.e., step S212) in detail. This search processing is described for the right field for the sake of simplicity, but the same applies to the mallet search processing on the left field.

[0101] In step S222, a rectangular region with a size of (2A+1) × (2B+1) pixels defined by the equations below is extracted:

$$x = [I'_x - A,\ I'_x + A]$$

$$y = [I'_y - B,\ I'_y + B] \qquad (3)$$

where $I'_x$ and $I'_y$ are coordinate values of the mallet position which was detected in the previous frame, and A and B are constants that determine the size of the search region, as shown in Fig. 12.

[0102] In step S230, pixels whose feature evaluation value $I_S(x, y)$ satisfies a given condition are extracted from all the pixels within the rectangular region defined in step S222. For the purpose of finding the mallet, similarity of a pixel value (infrared ray intensity value) is preferably used as the feature amount. In this embodiment, since the mallet has an infrared ray generator, an object that has a feature of corresponding infrared ray intensity is tentatively determined to be a mallet.

[0103] More specifically, in step S232, a search is made for a pixel whose similarity $I_S$ is equal to or larger than a predetermined threshold value, i.e., is close to that of the mallet. If such a pixel is found, a counter N stores the accumulated value of the occurrence frequency. Also, the x- and y-coordinate values of such a pixel are cumulatively stored in registers $SUM_x$ and $SUM_y$. That is,

$$N = N + 1$$

$$SUM_x = SUM_x + x$$

$$SUM_y = SUM_y + y \qquad (4)$$

Upon completion of step S230, the number N of all the pixels similar to the infrared ray pattern coming from the mallet in the region shown in Fig. 12, and the sum values $SUM_x$ and $SUM_y$ of the coordinate values are obtained. If N = 0, a result "Not Found" is output in step S236. If N > 0, it is determined that an object which is likely to be a mallet is found, and the mallet position is calculated in step S238 by:

$$I_x = \frac{SUM_x}{N}, \qquad I_y = \frac{SUM_y}{N} \qquad (5)$$

The calculated mallet position ($I_x$, $I_y$) is transferred into that on the table coordinate system in step S220 (Fig. 8), and the transferred value is passed to the management unit 5030 as a signal representing the "hand position".
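The local search of steps S222 to S238 amounts to taking the centroid of the bright pixels inside a window centered on the previous mallet position. The sketch below illustrates this under stated assumptions: the image is a list of rows of intensity values, the raw intensity stands in for the similarity value, and clipping the window to the image bounds is an added safeguard not described in the text. The names `local_search`, `prev_x`, `prev_y` and `threshold` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def local_search(
    image: Sequence[Sequence[float]],
    prev_x: int,
    prev_y: int,
    a: int,
    b: int,
    threshold: float,
) -> Optional[Tuple[float, float]]:
    """Centroid of pixels >= threshold in a (2a+1) x (2b+1) window, or None."""
    height = len(image)
    width = len(image[0]) if height else 0
    n = 0          # counter N of equation (4)
    sum_x = 0      # register SUMx
    sum_y = 0      # register SUMy
    # Window of equation (3), clipped to the image (clipping is an assumption).
    for y in range(max(0, prev_y - b), min(height, prev_y + b + 1)):
        for x in range(max(0, prev_x - a), min(width, prev_x + a + 1)):
            if image[y][x] >= threshold:
                n += 1
                sum_x += x
                sum_y += y
    if n == 0:
        return None  # corresponds to the "Not Found" result of step S236
    return sum_x / n, sum_y / n  # equation (5)
```

Averaging the coordinates of all qualifying pixels gives a position with sub-pixel resolution, which is one reason a centroid is convenient for a small bright spot such as an infrared emitter.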
[0104] Fig. 10 shows the sequence of the global region search in step S216 in detail.

[0105] In step S240 in Fig. 10, the maximum value of the feature evaluation values $I_S$ among pixels that satisfy:

$$\{(x, y) \mid x > 0,\ x < \mathrm{Width},\ x = nC,\ y > 0,\ y < \mathrm{Height},\ y = mD\ (\text{where } n \text{ and } m \text{ are integers})\} \qquad (6)$$

in the right field image IR is stored in a register Max. Note that C and D are constants that determine the coarseness of the search, and Width and Height are defined as shown in Fig. 15. That is, it is checked in step S242 if the feature amount $I_S$ exceeds the threshold value stored in the threshold value register Max. If it does, that feature amount is set as a new threshold value in step S244 by:

$$Max = I_S(x, y)$$

$$I_x = x$$

$$I_y = y \qquad (7)$$

In step S246, the coordinate value ($I_x$, $I_y$) of the pixel which is most likely to be a mallet found in the global search is passed to step S220.
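The global search can be read as a running-maximum scan over a coarse grid of pixels. The following Python sketch illustrates that reading. It assumes that the image is a list of rows of intensity values, that raw intensity stands in for the feature amount, and that the register Max starts at a caller-supplied threshold; the names `global_search` and `initial_threshold` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def global_search(
    image: Sequence[Sequence[float]],
    c: int,
    d: int,
    initial_threshold: float,
) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the strongest grid pixel above the threshold, or None."""
    height = len(image)
    width = len(image[0]) if height else 0
    max_value = initial_threshold  # register Max
    best = None
    # Grid of condition (6): x = nC with 0 < x < Width, y = mD with 0 < y < Height.
    for y in range(d, height, d):
        for x in range(c, width, c):
            if image[y][x] > max_value:
                # Equation (7): raise the threshold and remember the position.
                max_value = image[y][x]
                best = (x, y)
    return best
```

Larger values of `c` and `d` visit fewer pixels and therefore run faster, at the risk of stepping over a small bright spot; the position found is only approximate and is refined by the local search in the next frame.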

[0106] In this manner, the mallet is found from the image, and its coordinate value transferred into that on the table coordinate system is passed to the game status management unit 5030.

<Game Status Management>

[0107] Fig. 13 shows the game field of the air hockey game of this embodiment. This field is defined on the two-dimensional plane on the table 1000, and has x- and y-axes. Also, the field has two virtual goal lines, right and left, 1200R and 1200L, and virtual walls 1300a and 1300b arranged in the up-and-down direction of Fig. 13. The coordinate values of the virtual goal lines 1200R and 1200L and virtual walls 1300a and 1300b are known, and never move. On this field, the virtual image of the puck 1500 moves in correspondence with the movements of the mallets 260R and 260L.
[0108] The puck 1500 has coordinate information $P_p$ and velocity information $v_p$ at the present position, the left mallet 260L has coordinate information $P_{SL}$ and velocity information $v_{SL}$ at the present position, and the right mallet 260R has coordinate information $P_{SR}$ and velocity information $v_{SR}$ at the present position.

[0109] Fig. 14 is a flow chart for explaining the processing sequence in the game status management unit 5030.
[0110] In step S10, the initial position $P_{p0}$ and initial velocity $v_{p0}$ of the puck 1500 are set.
[0111] Note that the puck moves at a constant velocity $v_p$. Also, the puck undergoes perfectly elastic collision when it collides against a wall or the mallets, i.e., its velocity direction is reversed.

[0112] The game status management unit 5030 obtains velocity information $v_S$ from the mallet position information $P_S$ measured by the mallet position measurement unit 5010.
[0113] Step S12 is executed at Δt time intervals until either player of the game wins (it is determined in step S50 that one player has first scored 3 points in the game).
[0114] In step S12, the puck position is updated to:

$$P_p = P_{p0} + v_{p0} \cdot \Delta t \qquad (8)$$

After the initial position and initial velocity are set, the puck position is generally given by:

$$P_p = P_p + v_p \cdot \Delta t \qquad (9)$$

In step S14, it is checked if the updated puck position $P_p$ is located within the field of player #1 (left player). A case will be explained below wherein the puck 1500 is located on the left player side.
[0115] It is checked in step S16 if the current puck position interferes with the mallet 1100L. If it is determined that the puck 1500 is located at a position where the puck interferes with the mallet 1100L, since this means that the left player 2000 has moved the mallet 260L to hit the puck, the sign of the x-velocity component $v_{px}$ of the puck 1500 is inverted in step S18 so as to reverse the motion of the puck 1500, and the flow advances to step S20.
[0116] Note that in place of simply inverting the sign of the x-velocity component $v_{px}$ of the velocity, the puck may be controlled to move in the opposite direction by adding the manipulation velocity $v_{SLx}$ of the mallet to the x-direction velocity $v_{px}$ of the puck by calculating:

$$v_{px} = -v_{px} + v_{SLx} \qquad (10)$$

[0117] On the other hand, if the present puck position does not interfere with the mallet 1100L of the left player (NO in step S16), the flow directly advances to step S20.

[0118] It is checked in step S20 if the puck position $P_p$ interferes with the virtual wall 1300a or 1300b. If YES in step S20, the y-component of the puck velocity is inverted in step S22.
[0119] It is then checked in step S24 if the present puck position is within the goal line of the left player. If YES in step S24, the score of the opponent player, i.e., the right (#2) player, is incremented in step S26. It is checked in step S50 if either player has first scored 3 points. If YES in step S50, the game ends.

[0120] If it is determined in step S14 that the puck position $P_p$ is located on the right player side (#2 player side), step S30 and the subsequent steps are executed. The operations in steps S30 to S40 are substantially the same as those in steps S16 to S26.
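One pass through the loop of Fig. 14 can be sketched in Python as shown below. This is an illustration under simplifying assumptions, not the code of the embodiment: the interference test between puck and mallet is reduced to a boolean `mallet_hit` supplied by the caller, walls and goal lines are axis-aligned, and the names `Puck`, `GameField` and `step`, together with the default field dimensions, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Puck:
    x: float
    y: float
    vx: float
    vy: float


@dataclass
class GameField:
    goal_left_x: float = 0.0     # goal line of the left player (assumed value)
    goal_right_x: float = 100.0  # goal line of the right player (assumed value)
    wall_low_y: float = 0.0      # one virtual wall (assumed value)
    wall_high_y: float = 50.0    # the other virtual wall (assumed value)


def step(puck: Puck, fld: GameField, dt: float,
         mallet_hit: bool = False, mallet_vx: float = 0.0) -> Optional[str]:
    """Advance the puck by one interval; return the scoring side, if any."""
    # Equation (9): constant-velocity update.
    puck.x += puck.vx * dt
    puck.y += puck.vy * dt
    if mallet_hit:
        # Equation (10); with mallet_vx = 0 this is the plain sign inversion.
        puck.vx = -puck.vx + mallet_vx
    if puck.y <= fld.wall_low_y or puck.y >= fld.wall_high_y:
        puck.vy = -puck.vy  # elastic reflection off a virtual wall
    if puck.x <= fld.goal_left_x:
        return "right"  # puck crossed the left goal line: right player scores
    if puck.x >= fld.goal_right_x:
        return "left"
    return None
```

A caller would invoke `step` once per interval, add a point to the returned side, and stop when either side reaches 3 points, which corresponds to the check in step S50.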

[0121] In this manner, the game progress state is managed. The game progress state is determined by the puck and mallet positions, which are input to the image generation unit 5050 (5050L, 5050R), as described above.

<Correction of Head Position>

[0122] Fig. 16 shows the overall control sequence of the processing in the correction processing unit 5040 (5040L, 5040R). The correction processing unit 5040 corrects the view point position data and head posture data, which are calculated by the measurement unit 5000 based on the output from the magnetic sensor 220 and which normally include errors, on the basis of the marker position in the image obtained from the CCD camera 240. That is, in this correction, the correction value of the position of the camera 240 (which is closely related to the head position) is calculated from the marker position in the image captured by the camera 240, and the view-transferring matrix of the view point is corrected using the correction value. The corrected view-transferring matrix represents the corrected view point position and head posture. In other words, the corrected matrix will provide a virtual image at the corrected view point.
[0123] Fig. 26 illustrates the prirKiple of correcting view position and posture of the players according to the first 

35 embodiment, where the correction process is equivalent with obtaining a corrected view-transferring matrix 

[0124] Referring to Fig. 26, the camera 240 of the player has just sensed a marker 1603 in a picked-up image 300. The position of the marker 1603 is represented by (x_0, y_0) in the image coordinate system with respect to the image 300. The position of the marker 1603 in the world coordinate system is represented by (X_0, Y_0, Z_0), which is known. Since (x_0, y_0) is an image-coordinate value while (X_0, Y_0, Z_0) is a world-coordinate value, they cannot be compared directly. The first embodiment calculates a view-transferring matrix of the camera 240 on the basis of the output of the magnetic sensor 220, and then transfers the world-coordinate value (X_0, Y_0, Z_0) into an image-coordinate value (x'_0, y'_0). Because the difference between the coordinate values (x_0, y_0) and (x'_0, y'_0) implies an error in the outputs of the sensor 220, a correction matrix ΔM_C for correcting the difference is calculated, as will be described later.
[0125] As is apparent from Fig. 26, the apparatus according to the first embodiment has to identify or discriminate the marker 1603 from among the other markers within the image 300. The identification or discrimination is made in such a manner that the (known) three-dimensional world-coordinate values of all the markers are converted into image-coordinate values by means of the view-transferring matrix M_C, and the marker whose image-coordinate value is the closest to (x_0, y_0) is selected. The process associated with the identification will be described with reference to Figs. 19 and 20.

[0126] The process performed by the correction processing unit 5040 will be described below with reference to Fig. 16.

[0127] In step S400, a view-transferring matrix (4 x 4) of the camera 240 is calculated on the basis of the output from the magnetic sensor 220. In step S410, the coordinate position where each marker is to be observed in the image coordinate system is predicted on the basis of the view-transferring matrix obtained in step S400, an ideal projection matrix (known) of the camera 240 and the three-dimensional position (known) of each marker.

[0128] On the other hand, the marker position detection unit 5060 (5060L, 5060R) tracks the marker in the image obtained from the camera 240 (240L, 240R) attached to the head of the player. The marker position detection unit 5060 passes the detected marker position to the correction processing unit 5040 (in step S420). The correction processing unit 5040 (5040L, 5040R) determines the marker observed presently, i.e., a reference marker in correction, on the basis of the passed marker position information in step S420. In step S430, the correction processing unit 5040 calculates a correction matrix ΔM_C that corrects the position/posture of the camera 240 that the magnetic sensor 220 has detected, on the basis of a difference between the predicted coordinate value of the marker calculated in step S410 and the observed coordinate value of the marker (marker 1603 in the example of Fig. 26) detected by the detection unit 5060. The coordinate value of the marker measured by the detection unit 5060 would match the coordinate value of the marker predicted on the basis of the head position detected by the sensor 220 as long as the outputs of the sensor are correct. Therefore, the difference calculated in step S430 represents an error of the sensor 220. This makes it possible to correct the position/posture of the camera, as described above. The positions and postures of the camera and the view point have a known relationship which is represented by a three-dimensional coordinate transfer. Thus, in step S440, the view-transferring matrix of the view point calculated in step S432 is corrected on the basis of the matrix ΔM_C for correcting the position/posture of the camera. The unit 5040 then passes the corrected transferring matrix to the image generation unit 5050 (5050L, 5050R).

[0129] Fig. 17 shows the processing sequence for detecting marker positions, performed in the marker position detection unit 5060.

[0130] In step S500, a color image captured by the camera 240 is received. 

[0131] After that, a "local region search" and a "global region search" are respectively executed in steps S502 and S506 to detect the marker position (x, y) expressed in the image coordinate system. Since the "local region search" in step S502 and the "global region search" in step S506 are substantially the same as the "local region search" (Fig. 9) and "global region search" (Fig. 10) in the mallet search, the descriptions of those search sequences apply here, and a detailed description thereof will be omitted. However, for player #1 (left), the feature amount I_S for a marker search in the quoted control sequence (step S232) uses the pixel value of the pixel of interest:

$$
\frac{R}{(G + B)/2}
$$

Since red markers (1600 to 1604) are used for player #1, this feature amount expresses the reddish degree. Also, since green markers (1650 to 1654) are used for player #2 (right), the feature amount uses:

$$
\frac{G}{(R + B)/2}
$$

Also, these two amounts are used as the feature amount I_S in a global search.
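As an illustration of how such color feature amounts might drive a search, the sketch below scores every pixel and returns the best one. It assumes the feature amounts are the ratios R/((G+B)/2) for red markers and G/((R+B)/2) for green markers; the function names, the small `eps` guard against division by zero, and the image layout (rows of `(r, g, b)` tuples) are assumptions of this example, and the exhaustive scan only stands in for the local and global searches of Figs. 9 and 10.

```python
def reddish_degree(r, g, b, eps=1e-6):
    """Feature amount for red markers (player #1): R over the mean of G and B."""
    return r / ((g + b) / 2 + eps)


def greenish_degree(r, g, b, eps=1e-6):
    """Feature amount for green markers (player #2): G over the mean of R and B."""
    return g / ((r + b) / 2 + eps)


def best_pixel(image, feature):
    """Return the (x, y) of the pixel with the largest feature amount.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    best_value, best_xy = float("-inf"), None
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            value = feature(r, g, b)
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy
```

Passing the feature function as a parameter lets the same search routine serve both players, which matches the description that only the feature amount differs between them.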

[0132] The marker coordinate value obtained in step S502 or S506 is transferred into that on an ideal image coordinate system free from any distortion using a matrix M (having a size of, e.g., 3 x 3) for correcting distortion in step S510. The transferring formula used at that time is:

[0133] The processing in step S410 in Fig. 16 will be explained in detail below with reference to Fig. 18.
[0134] As described above, a transferring matrix M_C (a 4 x 4 view-transferring matrix) from the world coordinate system into the camera coordinate system is obtained in step S400. On the other hand, a transferring matrix P_C (4 x 4) from the camera coordinate system into the image coordinate system is also given as a known value. Also, the three-dimensional coordinate position (X, Y, Z) of the marker of interest is given as a known value.

[0135] Specifically, if an angle r represents the rotation (roll) in the Z-axis direction at the position of the camera 240, an angle p represents the rotation (pitch) in the X-axis direction at the position of the camera 240, an angle φ represents the rotation (yaw) in the Y-axis direction at the position of the camera 240, and (X_0, Y_0, Z_0) represents the position of the camera 240, the view-transferring matrix M_C of the camera 240, that is, a matrix for performing a transfer from the world coordinate system to the camera coordinate system, is given by a product whose first two factors are:

$$
M_C =
\begin{bmatrix}
\cos r & -\sin r & 0 & 0 \\
\sin r & \cos r & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & \cos p & -\sin p & 0 \\
0 & \sin p & \cos p & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdots
$$

followed by the factors for the yaw angle φ and for the camera position (X_0, Y_0, Z_0), whose entries are illegible here.

Let d be the focal length of the camera 240, w be the width of the imaging surface of the camera, and h be the height
of the imaging surface. Then, a matrix P_C for converting camera-coordinate values to the image coordinate system is given by:

$$
P_C =
\begin{bmatrix}
d/w & 0 & 0 & 0 \\
0 & d/h & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & -1 & 0
\end{bmatrix}
\tag{15}
$$



[0136] Consequently, in step S520 of Fig. 18 (corresponding to step S410 of Fig. 16), the coordinate position (X, Y, Z) of the marker of interest is transferred into that (x_h, y_h, z_h) on the image plane using:

$$
\begin{bmatrix} x_h \\ y_h \\ z_h \\ z_h \end{bmatrix}
= P_C \cdot M_C \cdot
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
\tag{16}
$$



[0137] In step S522, the predicted coordinate value (x, y) of the marker in the image coordinate system is obtained by:

$$
x = \frac{x_h}{z_h}, \qquad y = \frac{y_h}{z_h}
\tag{17}
$$

Thus, through step S410, the predicted image-coordinate values (x_i, y_i) of the markers i are obtained.
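The prediction of steps S520 and S522 can be sketched as below. The view-transferring matrix `m_c` is taken as a given 4 x 4 nested list (how it is built from the sensor output is not reproduced here), and the helper names are hypothetical. The projection matrix follows equation (15), which implies a camera looking along the negative Z axis of its own coordinate system.

```python
def matmul_vec(m, v):
    """Multiply a 4x4 matrix (nested lists) by a 4-vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def projection_matrix(d, w, h):
    """Matrix P_C of equation (15): d = focal length, w/h = imaging surface size."""
    return [
        [d / w, 0.0, 0.0, 0.0],
        [0.0, d / h, 0.0, 0.0],
        [0.0, 0.0, -1.0, 0.0],
        [0.0, 0.0, -1.0, 0.0],
    ]


def predict_image_position(m_c, p_c, marker_xyz):
    """Predict where a marker at world position (X, Y, Z) appears in the image."""
    X, Y, Z = marker_xyz
    # World coordinates -> camera coordinates (view-transferring matrix M_C).
    camera = matmul_vec(m_c, [X, Y, Z, 1.0])
    # Camera coordinates -> homogeneous image coordinates (step S520).
    x_h, y_h, z_h, _ = matmul_vec(p_c, camera)
    # Perspective division (step S522, equation (17)).
    return x_h / z_h, y_h / z_h
```

For instance, with an identity view matrix, `d = 1`, `w = 2`, `h = 1`, a marker at `(1.0, 2.0, -4.0)` is predicted at `(0.125, 0.5)`. A marker with `z_h == 0` (in the plane of the camera center) would need separate handling, which this sketch leaves out.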

[0138] The "marker determination" in step S420 will be explained below. Fig. 19 shows the case wherein the camera 240 of one player has captured an image 600 on the table 1000.

[0139] For example, let M1 to M7 be the markers arranged on the table 1000, as indicated by triangular marks. The three-dimensional position M_i of each marker is known. The image 600 includes the markers M2, M3, M6, and M7. On the other hand, the predicted observation position of each marker M_i is the one calculated in step S520, and is expressed by P_i. Also, Q represents the marker position, which is detected by and passed from the marker position detection unit 5060.

[0140] The "marker determination" in step S420 determines the P_i (i.e., M_i) to which the marker position Q detected by the marker position detection unit 5060 corresponds. In Fig. 19, assume that e_i represents the length, i.e., the distance, of a vector extending from the detected marker position Q to the predicted position P_i of each marker.

[0141] Fig. 20 shows the contents of step S420 in detail. That is, the processing in Fig. 20 extracts the marker that yields the minimum value among the distances e_i of the markers i (i = 0 to n) included in the image 600, and outputs the identifier i of that marker. That is,

$$
i : \min\{ e_i \}
\tag{18}
$$

In the example shown in Fig. 19, since the distance e_2 to P2 is the shortest, the marker M2 is used as data for correcting the magnetic sensor output.
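The selection rule of equation (18) amounts to a nearest-neighbor search in the image plane. A minimal sketch follows, assuming the predicted positions are held in a dictionary keyed by marker identifier; that container choice and the function name are assumptions of this example.

```python
import math


def determine_marker(q, predicted):
    """Return (i, e_i) for the predicted position P_i closest to the detection Q.

    `q` is the detected marker position (x, y); `predicted` maps each marker
    identifier i to its predicted image position P_i.
    """
    return min(
        ((i, math.dist(q, p_i)) for i, p_i in predicted.items()),
        key=lambda pair: pair[1],
    )
```

With `predicted = {2: (100, 100), 3: (160, 90), 6: (90, 200)}` and `q = (104, 103)`, the call returns `(2, 5.0)`, so marker M2 would serve as the reference marker. Returning the distance together with the identifier is convenient because the same error distance is used afterwards to compute the correction.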

[0142] As described above, since the camera 240 can capture at least one marker within the activation range (field) of the player independently of the movement of the player, the field need not be narrowed down unlike in the prior art.
[0143] The processing operation in step S430, which is the same as that described above with reference to Fig. 1, calculates the transfer matrix ΔM_C for correcting the position and posture of the camera on the basis of the error distance e_min and the direction thereof obtained through equation (18).

[0144] In step S432, executed in parallel with the above steps, the view-transferring matrix at the view position of the players is calculated on the basis of the outputs of the sensor 220. Then, letting M_VC denote a transfer matrix (which is known) from the camera-coordinate system to the view-coordinate system, a view-transferring matrix M'_V at the corrected view point is calculated using the above matrix M_VC and the following equation:

$$
\begin{aligned}
\Delta M_V &= M_{VC} \cdot \Delta M_C \cdot M_{VC}^{-1} \\
M'_V &= \Delta M_V \cdot M_V
\end{aligned}
\tag{19}
$$

where M_V represents the viewing transferring matrix of the view point which is obtained from the output of the sensor and is prior to the correction.
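Equation (19) moves the camera correction into the view coordinate system by a change of basis and then applies it to the uncorrected view matrix. The sketch below illustrates this with plain nested lists; it assumes that M_VC is a rigid transform (rotation plus translation), so that its inverse can be formed from the transposed rotation, and all function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(m):
    """Invert a rigid transform [R | t]; the inverse is [R^T | -R^T t]."""
    rt = [[m[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(rt[i][k] * m[k][3] for k in range(3)) for i in range(3)]
    return [rt[0] + [t[0]], rt[1] + [t[1]], rt[2] + [t[2]],
            [0.0, 0.0, 0.0, 1.0]]


def translation(tx, ty, tz):
    """4x4 translation matrix, used here only to build example inputs."""
    return [[1.0, 0.0, 0.0, tx], [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz], [0.0, 0.0, 0.0, 1.0]]


def corrected_view_matrix(m_v, m_vc, delta_m_c):
    """Apply equation (19): dM_V = M_VC dM_C M_VC^-1, then M'_V = dM_V M_V."""
    delta_m_v = matmul(matmul(m_vc, delta_m_c), rigid_inverse(m_vc))
    return matmul(delta_m_v, m_v)
```

When the camera and the view point differ only by a translation, the conjugation leaves a pure translation correction unchanged; with a rotation between them, the correction is re-expressed along the rotated axes.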

[0145] The error distance is calculated in terms of the image-coordinate system in the first embodiment, as illustrated in Fig. 16. However, as is apparent from Fig. 26 and will be apparent from the second embodiment described later, the distance can be calculated in terms of the world coordinate system, thus providing the corrected view-transferring matrix at the view point.

<Improvement of Detection Precision of Head Position>

[0146] In the above first embodiment, one camera 240L (240R) for monitoring a front view is arranged on the HMD 210L (210R). A marker image on the table 1000 captured by the camera 240 is processed by the processing unit 5060 to specify the marker in the image (step S420), and the head posture of the player, i.e., the posture of the camera attached to the head, in other words, a view-transferring matrix of the camera with that posture, is determined. However, the first embodiment, which merely utilizes errors in terms of the image coordinate system, causes a three-dimensional displacement in the relationship between the camera and the marker.

[0147] In addition, there may be cases for some applications of mixed reality presentation where markers should be positioned in a three-dimensional manner. In such cases, the identifying method according to the first embodiment illustrated in Fig. 16 deteriorates in reliability.

[0148] The second embodiment is proposed to eliminate the three-dimensional displacement set forth above by providing each player with two cameras and detecting markers in terms of the world coordinate system. The second embodiment is also proposed to relax the restricting condition that markers be positioned on a flat plane.

[0149] Specifically, the second embodiment employs, as shown in Fig. 21, two cameras 240LR and 240LL (240RR, 240RL) which are attached to the HMD 210L (210R) of the player 2000 (3000), and the postures of the cameras 240LR and 240LL (240RR, 240RL) are detected from stereoscopic images obtained from these cameras 240LR and 240LL (240RR, 240RL).

[0150] The second embodiment uses two cameras mounted on each player so as to cope with three-dimensionally arranged markers. However, the second embodiment will be described below as applied to MR presentation for the hockey game using two-dimensionally arranged markers.

[0151] Fig. 22 partially shows an image processing system according to the second embodiment. That is, Fig. 22 shows the modified blocks of the image processing system of the first embodiment (Fig. 7). More specifically, upon comparing Figs. 7 and 22, the image processing system of the second embodiment differs from the first embodiment in that it comprises a marker position detection unit 5060L' (5060R') and a correction processing unit 5040L' (5040R') in addition to the two cameras provided to each player; however, the marker position detection unit 5060L' (5060R') and correction processing unit 5040L' (5040R') of the second embodiment differ merely in software processing from the marker position detection unit 5060L (5060R) and correction processing unit 5040L (5040R) of the first embodiment.

[0152] Fig. 23 shows the control sequence of the second embodiment, especially for the left player 2000. More particularly, the collaborations among the marker position detection unit 5060L', the position/posture detection unit 5000, and the correction processing unit 5040L', corresponding to the control sequence in Fig. 16 of the first embodiment, will be explained below.

[0153] In Fig. 23, the position/posture detection unit 5000, which is the same as that in the first embodiment, calculates the viewing transferring matrix of the view point on the basis of the output from the magnetic sensor 220L in step S398. In step S400', an inverse matrix of the viewing transferring matrix of the camera 240LR is calculated on the basis of the output of the magnetic sensor 220L. This transferring matrix is sent to the correction processing unit 5040'.
[0154] Images from the two cameras 240LL and 240LR are sent to the marker position detection unit 5060L'. That is, in step S402, the detection unit 5060L' extracts a marker image m_R from an image R captured by the right camera 240LR. I_mR represents the coordinate position of the extracted marker (i.e., the observation coordinate position). In step S404, the detection unit 5060L' extracts a marker image m_L from an image L captured by the left camera 240LL. I_mL represents the coordinate position of the extracted marker. Since the marker images m_R and m_L originate from an identical marker m_X, a three-dimensional position of the observed marker on the coordinate system of the camera 240LR is calculated from the pair of observed marker coordinate positions (I_mR, I_mL) on the basis of the principle of trigonometric measurement (triangulation), in step S406.

[0155] In step S404, a corresponding point search for the marker image m_L is made using a general stereoscopic viewing technique. Alternatively, in order to attain high-speed processing, the search range may be limited using a known epipolar constraint.

[0156] Steps S410', S420', S422, and S430' in Fig. 23 are the processing operations in the correction processing unit 5040L'.

[0157] In step S410', the three-dimensional position of the observed marker on the camera coordinate system is transferred into a three-dimensional position W_m on the world coordinate system using the view-transferring matrix calculated in step S400'. In step S420', the three-dimensional positions W_mi (known) of all the markers m_i on the world coordinate system are read out from a predetermined memory, and the W_mi that minimizes the Euclidean distance |W_mi - W_m| between each marker m_i and the observed marker m_X is determined. In other words, the known marker closest to the observed marker m_X is identified.
[0158] Although W_mi and W_m are originally the same position, an error vector D (corresponding to e in the first embodiment) is likely to be present due to the error of the sensor 220. Hence, in step S420', a marker is specified that has a coordinate value W_mi closest to the three-dimensional coordinate value (in the world coordinate system) of the tracked (observed) marker. Then, in step S430', a correction vector D representing the distance between the tracked marker and the determined marker is calculated from:

$$
D = W_{mi} - W_m
\tag{20}
$$

and then ΔM_C, which moves the position of the camera by the vector amount, is obtained. In step S440', a viewing transferring matrix of the view point is calculated using a method similar to that of the first embodiment.
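Steps S410' to S430' can be sketched as follows. The example assumes that the observed marker position in camera coordinates is already available from triangulation, that `cam_to_world` is the inverse view-transferring matrix of step S400' as a 4 x 4 nested list, and that the correction vector is taken as the known position minus the observed position; the sign convention and all names are assumptions of this illustration.

```python
import math


def correction_vector(cam_to_world, observed_cam, known_markers):
    """Return (W_mi, D): the closest known marker and the correction vector.

    `observed_cam` is the triangulated marker position in camera coordinates,
    `known_markers` is a list of known world positions W_mi.
    """
    x, y, z = observed_cam
    v = [x, y, z, 1.0]
    # Step S410': camera coordinates -> world coordinates (W_m).
    w_m = tuple(sum(cam_to_world[i][k] * v[k] for k in range(4))
                for i in range(3))
    # Step S420': known marker with the smallest Euclidean distance to W_m.
    w_mi = min(known_markers, key=lambda w: math.dist(w, w_m))
    # Step S430': correction vector D between the two positions.
    d = tuple(a - b for a, b in zip(w_mi, w_m))
    return w_mi, d
```

Because the comparison is made in three dimensions, the markers in `known_markers` are not required to lie on a single plane, which is the restriction the second embodiment sets out to relax.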
[0159] In this manner, since the present invention can improve position detection of the observed marker in a three-dimensional manner using the HMD with two cameras, the position and posture of a view point can be precisely detected, and thus virtual and real images for MR can be smoothly connected.

<1st Modification>

[0160] The present invention is not limited to the first and second embodiments above.

[0161] In the first embodiment, the processing for detecting a marker from the image uses the marker detected first as the marker to be tracked, as shown in Fig. 17. For this reason, as shown in, e.g., Fig. 24, when an image 800 including a marker M1 is obtained in a certain frame, if the marker is included in an image region 810 of the subsequent frame, even though it is located at an end portion of the region 810, the marker M1 can be determined as a reference marker for correction. However, when, for example, an image 820 is obtained in the subsequent frame, if the marker M1 falls outside the region of that image and a marker M2 is included instead, the reference marker for correction must be changed to that marker M2. Such changes in marker are also required when tracking fails, and positional deviation correction then uses the newly tracked marker.

[0162] As a problem posed upon switching the marker used in correction, a virtual object may move unnaturally due to abrupt changes in correction value upon switching the marker.

[0163] To prevent this problem, in a modification to be proposed below, the correction value of the previous frame is reflected upon setting the next correction value so as to keep temporal matching between these correction values.
[0164] More specifically, let v_t be the correction value (a three-dimensional vector representing translation on the world coordinate system) in a certain frame, and v'_{t-1} be the correction value in the previous frame. Then, v'_t obtained by the equation below is used as a new correction value:

$$
v'_t = \alpha \cdot v'_{t-1} + (1 - \alpha) \cdot v_t
\tag{21}
$$

where α is a constant (0 ≤ α < 1) that defines the degree of influence of the previous information. The equation above implies that α represents the degree of contribution of the correction value v'_{t-1} in the previous frame, and the correction value v_t obtained in the present frame is used at the degree of contribution of (1 - α).
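Equation (21) is an exponential smoothing of the correction vector. A minimal sketch, with correction values represented as tuples and a hypothetical function name:

```python
def smooth_correction(previous, current, alpha=0.5):
    """Blend the previous smoothed correction v'_{t-1} with the new value v_t.

    Implements v'_t = alpha * v'_{t-1} + (1 - alpha) * v_t for 0 <= alpha < 1.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    return tuple(alpha * p + (1.0 - alpha) * c
                 for p, c in zip(previous, current))
```

For example, `smooth_correction((0.0, 0.0, 0.0), (4.0, 0.0, -2.0), 0.75)` gives `(1.0, 0.0, -0.5)`: a sudden jump of the raw correction after a marker switch reaches the displayed image only gradually over the following frames. With `alpha = 0` the previous frame is ignored and the raw correction is used as is.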

[0165] With this control, abrupt changes in correction value can be relaxed, and a three-dimensional virtual image can be prevented from being suddenly changed (unnaturally moved). By setting the constant α to a proper value, an object can be prevented from moving unnaturally upon switching of the marker.

<2nd Modification>

[0166] In the above embodiment, when a marker cannot be found in a local search, the processing for detecting a marker from the image uses the point with the highest similarity on the entire frame as the marker to be tracked, independently of the marker position in the previous frame. In a modification to be proposed below, a marker search is made on the basis of the marker position found in the previous frame. Even when the image frame has moved upon movement of the player, the marker is likely to be present at a position which is not largely offset from the position in the previous frame.

[0167] Fig. 25 is a view for explaining the principle of searching the present frame for a marker found in the previous frame. The marker search is made along such a search route, and if a point having a similarity equal to or higher than a given threshold value is found, this point is used as the marker.

<3rd Modification>

[0168] The above embodiment uses an optical HMD. However, the present invention is not limited to an optical HMD, but may be applied to a video-see-through HMD.

<4th Modification>

[0169] In the above embodiment, the present invention is applied to the air hockey game. However, the present invention is not limited to the air hockey game.

[0170] In the present invention, since the operations (e.g., mallet operations) of a plurality of operators are sensed and captured using a single camera means, the operations of the plurality of operators can be reproduced in a single virtual space. Hence, the present invention can be suitably applied to any other collaborative operations based on at least two operators (e.g., MR presentation of design works by a plurality of persons, a battle game requiring a plurality of players).

[0171] The processing for correcting the head posture position based on a plurality of markers of the present invention is suitable not only for collaborative operations of a plurality of operators but also for a system that presents MR to a single operator (or player).

<Other Modifications>

[0172] There may be proposed a modification in which more than three cameras are used in the second embodiment.
[0173] It is enough for the camera 240 of the embodiments to capture at least one marker in images sensed by the camera. Too many markers would result in a number of markers being captured in images by the camera, which would cause erroneous identification of markers in the above process for identifying a tracked marker, described in association with step S430 of Fig. 16 and step S430' of Fig. 23. Therefore, the number of markers may be reduced so that only one marker is captured in the images, if movement of the players can be limited.

[0174] Further, the position/posture detection apparatus as set forth above outputs the view-transferring matrix at the player's view point. The present invention is not limited to such an apparatus, and may be applied to an apparatus that outputs a corrected view point of the players in a format of (X, Y, Z, r, p, φ), where r denotes the rolling angle, p the pitch angle, and φ the yaw angle.

[0175] As described above, according to the present invention, since the operations of a plurality of operators are captured by a single camera or sensor, the positional relationship of the individual operators required for presenting MR can be systematically recognized.

[0176] Also, according to the present invention, since a plurality of markers are sensed by the camera, at least one marker is captured in that image. Hence, even when the operator moves across a broad work range or moving range, the head position of the operator can be tracked, thus allowing MR presentation over the broad range.
[0177] As many apparently widely different embodiments of the present invention can be made without departing from the spirit and scope thereof, it is to be understood that the invention is not limited to the specific embodiments thereof except as defined in the appended claims.

Claims

1. A mixed reality presentation apparatus which generates a three-dimensional virtual image associated with a collaborative operation to be done by a plurality of operators (2000, 3000) in a predetermined mixed reality environment, and displays the generated virtual image on see-through display devices (210L, 210R) respectively attached to the plurality of operators, comprising:

first sensor means (5010, 230) for detecting a position of each of actuators (260L, 260R) which are operated by the plurality of operators and move as the collaborative operation progresses;

second sensor means (220, 5000, 5040) for detecting a view point position of each of the plurality of operators in an environment of the collaborative operation; and

generation means (5030, 5050R, 5050L) for generating three-dimensional images for the see-through display devices of the individual operators, said generation means generating a three-dimensional virtual image representing an operation result of the collaborative operation that has progressed according to a change in position of each of the plurality of actuators detected by said first sensor means when viewed from the view point position of each operator detected by said second sensor means, and outputting the generated three-dimensional virtual image to each see-through display device.

2. The apparatus according to claim 1, wherein said first sensor means (5010) comprises:

an image sensing camera (230) which includes a maximum range of the actuator within a field of view thereof, the position of said actuator moving upon operation of the operator; and

image processing means (S100, S200) for performing image-processing to detect a position of each actuator in an image obtained by said camera.

3. The apparatus according to claim 1, wherein the actuator includes an illuminator emitting light having a predetermined wavelength, and said first sensor means comprises a camera (230) which is sensitive to the light having the predetermined wavelength.

4. The apparatus according to claim 1, wherein the actuator is a mallet operated by a hand of the operator.

5. The apparatus according to claim 1, wherein the see-through display device comprises an optical transmission type display device (210).

6. The apparatus according to claim 1, wherein said second sensor means detects a head position and posture of each operator, and calculates a view point position in accordance with the detected head position and posture.

7. The apparatus according to claim 5, wherein said second sensor means comprises:

a generator (250) for generating an AC magnetic field; and

a magnetic sensor (220L, 220R) attached to the head portion of each operator.

8. The apparatus according to claim 1, wherein said generation means comprises:

storage means (S10 - S52) for storing a rule of the collaborative operation;

means (5050L, 5050R) for generating a virtual image representing a progress result of the collaborative operation in accordance with the rule stored in said storage means in correspondence with detected changes in position of the plurality of actuators; and

means (5050L, 5050R, S440) for generating a three-dimensional virtual image for each view point position by transferring a coordinate position for each view point position of each operator detected by said second sensor means.

9. A mixed reality presentation apparatus which generates a three-dimensional virtual image associated with a collaborative operation to be done by a plurality of operators in a predetermined mixed reality environment, and displays the generated virtual image on see-through display devices (210L, 210R) respectively attached to the plurality of operators, comprising:

a camera (230) arranged so as to include a plurality of actuators operated by the plurality of operators in the collaborative operation within a field of view thereof;

actuator position detection means (230, 5010) for outputting information relating to positions of the actuators associated with a coordinate system of that environment on the basis of an image sensed by said camera;

sensor means (220L, 220R, 5000) for detecting and outputting a view point position of each of the plurality of operators in the environment of the collaborative operation; and

image generation means (5030, 5050R, 5050L) for generating a three-dimensional virtual image of a progress result viewed from the view point position of each operator detected by said sensor means to each see-through display device so as to present the progress result of the collaborative operation that has progressed according to detected changes in position of the actuator to each operator.

10. A mixed reality presentation apparatus which generates a three-dimensional virtual image associated with a collaborative operation to be done by a plurality of operators in a predetermined mixed reality environment, and displays the generated virtual image on see-through display devices (210L, 210R) respectively attached to the plurality of operators, comprising:

a first camera (230) which substantially includes the plurality of operators within a field of view thereof;

a first processor (5010) arranged so as to calculate operation positions of the plurality of operators on the basis of an image obtained by said first camera;

a detection device (5000) detecting a view point position of each operator using a plurality of sensors (220L, 220R) attached to the plurality of operators;

a plurality of second cameras (240L, 240R) sensing front fields of the individual operators, at least one second camera being attached to each of the plurality of operators;

a second processor (5060L, 5060R) calculating information associated with a line of sight of each operator on the basis of each of images from said plurality of second cameras;

a third processor (5040L, 5040R) correcting the view point position of each operator detected by the sensor using the line of sight information from said second processor and outputting the corrected view point position as a position on a coordinate system of the mixed reality environment;

a first image processing device (5030, 5050L, 5050R) making the collaborative operation virtually progress on the basis of the operation position of each operator calculated by said first processor, and generating three-dimensional virtual images representing results that have changed along with the progress of the collaborative operation for the plurality of operators; and

a second image processing device (5040L, 5040R, 5050R, 5050L) transferring coordinate positions of the three-dimensional virtual images for the individual operators generated by said first image processing device in accordance with the individual corrected view point positions calculated by said third processor, and outputting the coordinate-transferred images to the see-through display devices.

30 11. A method of generating a three-dimensional virtual image associated with a collatx>rative operation to be done 
within a predetermined mixed reality environment so as to display the image on see-through display devices 
attached to a plurality of operators in the mixed reality environment connprising: 

the image sensing step of sensing a plurality of actuators operated by the plurality of operators by a camera 

35 that includes the plurality of operators within a field of view thereof; 

the actuator position acquisition step of calculating information relating to positions of the actuators associated 
with a coordinate system of the environment on the basis of the image sensed by the camera; 
the view point position detection step of detecting a view point position of each of the plurality of operators in 
the environment of the collaborative operation on the coordinate system of the environment;

the progress step of making the collaborative operation virtually progress in accordance with changes in
position of the plurality of actuators calculated in the actuator position acquisition step; and
the image generation step of outputting a three-dimensional virtual image of a progress result in the progress
step viewed from the view point position of each operator detected in the view point position detection step to
each see-through display device so as to present the progress result in the progress step to each operator.

12. A mixed reality presentation method for generating a three-dimensional virtual image associated with a
collaborative operation to be done by a plurality of operators (2000, 3000) in a predetermined mixed reality
environment, and displaying the generated virtual image on see-through display devices (210L, 210R) respectively
attached to the plurality of operators, comprising:

the first image sensing step of capturing an image using a first camera (230) which substantially includes the
plurality of operators within a field of view thereof;

the first detection step of detecting operation positions of the plurality of operators on the basis of the image
sensed by the first camera; 

the second detection step of detecting a view point position of each operator using a plurality of sensors (220L,
220R) respectively attached to the plurality of operators;

the second image sensing step of sensing a front field of each operator using each of a plurality of second
cameras (240L, 240R), at least one second camera being attached to each of the plurality of operators;

the line of sight calculation step of calculating information associated with a line of sight of each operator on
the basis of each of images obtained from the plurality of second cameras;

the correction step of correcting the view point position of each operator detected by the sensor on the basis
of the line of sight information calculated in the line of sight calculation step, and obtaining the corrected view
point position as a position on a coordinate system of the mixed reality environment;

the generation step of making the collaborative operation virtually progress on the basis of the operation
positions of the individual operators detected in the first detection step, and generating three-dimensional virtual
images that represent results of the collaborative operation and are viewed from the view point positions of the 
plurality of operators; and 

the step of transferring coordinate positions of the three-dimensional virtual images for the individual operators
generated in the generation step in accordance with the individual corrected view point positions obtained in
the correction step, and outputting the coordinate-transferred images to the see-through display devices.

13. A storage medium which stores a program that implements a method of claim 11.

14. A storage medium which stores a program that implements a method of claim 12.

15. A game apparatus which incorporates a storage medium of claim 11.

16. A game apparatus having a mixed reality presentation apparatus of claim 1.

17. A position/posture detection apparatus for detecting an operation position of an operator so as to generate a
three-dimensional virtual image that represents an operation done by the operator in a predetermined mixed reality
environment, comprising:

a position/posture sensor (220) for measuring a three-dimensional position and posture of the operator to
output an operator's position and posture signal;

a camera sensing images of a first plurality of markers arranged at known positions in the environment;
detection means for processing an image signal from said camera, tracking a marker of the first plurality of
markers, and detecting a coordinate value of the tracked marker in a coordinate system; and

calculation means for calculating a portion-position and -posture representing a position and posture of the
operating portion, on the basis of the coordinate value of the tracked marker detected by said detection means
and the operator's position and posture signal outputted from the position/posture sensor.

18. The apparatus according to claim 17, wherein a distance between one marker and another marker of the plurality
of markers in a direction crossing in front of the operator is set to be larger as the markers are farther from the
operator.

19. The apparatus according to claim 17, wherein where a plurality of operators perform a collaborative operation,
markers for one operator are of the same representation manner.

20. The apparatus according to claim 17, wherein the portion is a view point position of the operator.

21. The apparatus according to claim 17, wherein said detection means uses a marker firstly found within an image
obtained by said camera.

22. The apparatus according to claim 17, wherein said detection means comprises means for searching an image of a
present scene for a marker found in an image of a previous scene.

23. The apparatus according to claim 17, wherein the sensor is mounted on the head of the operator.

24. The apparatus according to claim 17, wherein a layout distribution density of the plurality of markers in the
environment is set so that a density distribution of markers farther from the operator is set to be lower than a density
distribution of markers closer to the operator.

25. The apparatus according to claim 17, wherein the first plurality of markers are arranged within the environment so
that at least one marker is captured within the field of image of the camera.

26. The apparatus according to claim 17, wherein said detection means calculates a coordinate of the tracked marker
in an image coordinate system.

27. The apparatus according to claim 17, wherein said detection means calculates a coordinate of the tracked marker
in a camera coordinate system.

28. The apparatus according to claim 17, wherein the first plurality of markers are depicted on a planar table arranged
within the environment.

29. The apparatus according to claim 27, wherein said first plurality of markers are arranged in a three-dimensional
manner.

30. The apparatus according to claim 17, wherein said detection means comprises identifying means for identifying a
marker to be tracked from among said first plurality of markers.

31. The apparatus according to claim 17, wherein said detection means comprises means for selecting, where said
detection means detects a second plurality of markers within an image captured by said camera, one marker to be
tracked from among said second plurality of markers.

32. The apparatus according to claim 17, wherein the identifying means identifies a marker selected by the selection
means in terms of an image coordinate system.

33. The apparatus according to claim 17, wherein the identifying means identifies a marker selected by the selection
means in terms of a world coordinate system.

34. The apparatus according to claim 30, wherein the identifying means comprises:

means for detecting a signal representing a position/posture of the camera;

means for converting three-dimensional coordinates of said first plurality of markers in the world coordinate
system into a coordinate value in terms of the image coordinate system, in accordance with the signal
representing position/posture of the camera; and

means for identifying a marker to be tracked by comparing the coordinates of the first plurality of markers in the
world coordinate system and an image coordinate value of the tracked marker.
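
By way of illustration only, the identification recited in claim 34 can be sketched as follows. The sketch assumes a
simple pinhole camera model; the function names, the focal-length parameter, and the nearest-distance criterion are
hypothetical choices made for this example and are not taken from the specification. The known world coordinates of
the markers are converted into image coordinates using the camera position/posture, and the marker whose converted
coordinates lie closest to the image coordinate value of the tracked marker is taken as the identified marker.

```python
import math


def world_to_camera(point_world, rotation, translation):
    """Map a world-frame point into the camera frame: p_cam = R * p_world + t.

    `rotation` is a 3x3 matrix given as nested lists and `translation` a
    3-vector; both stand in for the camera position/posture signal.
    """
    return tuple(
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def project_to_image(point_cam, focal_length):
    """Pinhole projection (assumed model) of a camera-frame point to image coordinates."""
    x, y, z = point_cam
    if z <= 0:
        return None  # point is behind the camera and cannot appear in the image
    return (focal_length * x / z, focal_length * y / z)


def identify_marker(tracked_uv, markers_world, rotation, translation, focal_length):
    """Return the index of the known marker whose projection is nearest to tracked_uv.

    Returns None when no marker projects in front of the camera.
    """
    best_index, best_dist = None, math.inf
    for index, marker in enumerate(markers_world):
        uv = project_to_image(
            world_to_camera(marker, rotation, translation), focal_length
        )
        if uv is None:
            continue
        dist = math.hypot(uv[0] - tracked_uv[0], uv[1] - tracked_uv[1])
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index
```

For example, with an identity rotation, zero translation, a focal length of 100, and markers at (0, 0, 2), (1, 0, 2)
and (-1, 0, 2), a tracked image coordinate of (46, 3) is matched to the second marker (index 1), whose projection is
(50, 0).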

35. The apparatus according to claim 30, wherein the identifying means comprises:

means for detecting a signal representing a position/posture of the camera;

means for converting a coordinate of the tracked marker in terms of a camera coordinate system into a
coordinate value in terms of the world coordinate system; and
selection means for selecting said at least one marker to be tracked by comparing coordinates of the tracked
marker and coordinates of the first plurality of markers, in terms of the world coordinate system.

36. The apparatus according to claim 30, wherein the operation portion includes a view position of the operator,

said calculation means obtains a position/posture signal at a view point of the operator on the basis of:

said operator position/posture signal, and

a distance difference between an image coordinate value of the tracked marker and a coordinate value of the
tracked marker which is converted from a three-dimensional coordinate of the marker in the world coordinate
system.
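
As a non-limiting sketch of the distance difference recited in claim 36, the following computes the offset between
the observed image coordinate of the tracked marker and the image coordinate predicted from the marker's world
coordinate. The conversion of that offset into approximate yaw and pitch errors is an assumption made for this
example (pinhole camera, marker near the optical axis); the function names and the focal-length parameter are
hypothetical and are not taken from the specification.

```python
import math


def image_offset(observed_uv, predicted_uv):
    """Offset (du, dv) between the observed marker and its predicted image position."""
    return (observed_uv[0] - predicted_uv[0], observed_uv[1] - predicted_uv[1])


def offset_to_angles(offset, focal_length):
    """Approximate yaw and pitch errors in radians for a pixel offset.

    Assumed model: pinhole camera with the marker close to the optical axis,
    so each axis can be treated independently.
    """
    du, dv = offset
    return (math.atan2(du, focal_length), math.atan2(dv, focal_length))
```

A sensor-derived posture could then be adjusted by these angles before the virtual image is rendered; how the
adjustment is applied depends on the sensor and display geometry and is left open here.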

37. The apparatus according to claim 30, wherein the operation portion includes a view position of the operator,

said calculation means obtains a position/posture signal at a view point of the operator on the basis of:
said operator position/posture signal, and
a distance difference between a coordinate value of the tracked marker which is converted from the camera
coordinate system into the world coordinate system and a three-dimensional coordinate of the marker in the
world coordinate system and a coordinate value of the tracked marker.

38. The apparatus according to claim 17, wherein the sensor comprises a magnetic sensor mounted on the head of
the operator.

39. The apparatus according to claim 17, wherein said camera includes a plurality of camera units attached to the
operator's head, and

said identifying means identifies a marker to be tracked in the world coordinate system.

40. The apparatus according to claim 39, wherein said camera includes two camera units.

41. A mixed reality presentation apparatus comprising:

a work table having a first plurality of markers arranged at known positions;
a position/posture sensor attached to an operator to detect a head posture of the operator;
a camera being set to capture at least one of the first plurality of markers within a field of view of the camera;

a detection means for processing an image signal from the camera, tracking a marker from among the first
plurality of markers, and detecting a coordinate value of a tracked marker;

calculation means for calculating a position/posture signal representing a position and posture of the operator's
viewpoint on the basis of the coordinate value of the tracked marker detected by said detection means and an
operator's head position/posture signal outputted from the position/posture sensor; and

generation means for generating a virtual image for presenting a mixed reality at the view point in accordance 
with the calculated position/posture signal. 

42. The apparatus according to claim 41, wherein a distance between one marker and another marker of the plurality
of markers in a direction crossing in front of the operator is set to be larger as the markers are farther from the
operator.

43. The apparatus according to claim 41, wherein where a plurality of operators perform a collaborative operation,
markers for one operator are of the same representation manner. 

44. The apparatus according to claim 41, wherein said detection means comprises:

means for tracking a marker within an image obtained by the camera; and

means for outputting a coordinate value of the tracked marker in an image coordinate system.

45. The apparatus according to claim 44, wherein said detection means uses a marker firstly found within an image
obtained by said camera.

46. The apparatus according to claim 44, wherein said detection means comprises means for searching an image of a
present scene for a marker found in an image of a previous scene.

47. The apparatus according to claim 44, wherein a layout distribution density of the plurality of markers in the
environment is set so that a density distribution of markers farther from the operator is set to be lower than a density
distribution of markers closer to the operator.

48. The apparatus according to claim 41, wherein the first plurality of markers are arranged within the environment so
that at least one marker is captured within the field of image of the camera.

49. The apparatus according to claim 41, wherein said detection means calculates a coordinate of the tracked marker
in an image coordinate system.

50. The apparatus according to claim 41, wherein said detection means calculates a coordinate of the tracked marker
in a camera coordinate system.

51. The apparatus according to claim 41, wherein the first plurality of markers are depicted on a planar table arranged
within the environment.

52. The apparatus according to claim 51, wherein said first plurality of markers are arranged in a three-dimensional
manner.

53. The apparatus according to claim 41, wherein said detection means comprises:

identifying means for identifying a marker to be tracked from among said first plurality of markers.

54. The apparatus according to claim 41, wherein the identifying means identifies a marker in terms of an image
coordinate system.

55. The apparatus according to claim 41, wherein the identifying means identifies a marker in terms of a world
coordinate system.

56. A position/posture detection method for detecting an operation position of an operator so as to generate a
three-dimensional virtual image associated with an operation to be done by the operator in a predetermined mixed
reality environment, comprising:

the step of measuring to output an operator position/posture signal indicative of a three-dimensional position
and posture of the operator;

the step of processing an image signal from a camera which captures a plurality of markers arranged in the
environment, tracking at least one marker and detecting a coordinate of said at least one marker; and

outputting a head position/posture signal indicative of a position and posture of the head of the operator, on the
basis of the coordinate of the tracked marker and the measured operator position/posture signal.

57. A method of presenting a mixed reality in accordance with a position and posture of view point of the operator
detected by the method according to claim 56.

58. A method according to claim 56, further comprising:

tracking at least one marker by processing image signals sensed by a plurality of camera units mounted on the
head of the operator, with a tri-angle measurement method.
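
Purely as an illustrative sketch of the kind of measurement recited in claim 58, the following recovers a marker
position from two head-mounted camera units. It assumes two parallel, rectified pinhole cameras separated by a known
baseline, with image coordinates measured from each camera's principal point; the function name, its parameters,
and this camera geometry are assumptions made for the example and are not taken from the specification.

```python
def triangulate_marker(u_left, u_right, v, focal_length, baseline):
    """Recover a marker position (x, y, z) in the left camera's frame.

    u_left, u_right: horizontal image coordinates of the marker in each camera.
    v: vertical image coordinate (the same row in both rectified images).
    focal_length: focal length in pixels; baseline: camera separation in metres.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        # A marker in front of both cameras must produce a positive disparity.
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_length * baseline / disparity  # depth from similar triangles
    x = u_left * z / focal_length
    y = v * z / focal_length
    return (x, y, z)
```

With a focal length of 500 pixels and a baseline of 0.1 m, a marker seen at u = 30 in the left image and u = 5 in the
right image has a disparity of 25 pixels, which places it at a depth of about 2 m.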

59. A storage medium which stores a computer program that describes the method of claim 56.

60. A storage medium which stores a computer program that describes a method of claim 57.

61. A storage medium which stores a computer program that describes a method of claim 58.

62. A position/posture detection apparatus for detecting an operation position of an operator, comprising:

a position/posture sensor for measuring a three-dimensional position and posture of the operator to output an
operator's position and posture signal;

a camera sensing images of a first plurality of markers arranged at known positions in the environment;
detection means for processing an image signal from said camera, tracking a marker of the first plurality of
markers, and detecting a coordinate value of the tracked marker in a coordinate system; and
correction means for correcting an output signal from the sensor on the basis of coordinate value of the tracked
marker.

63. Apparatus for playing a game, comprising a plurality of actuators for use by respective players and mixed reality
presentation means for presenting to each of the players a view of at least one virtual marker responsive to
operator movement of the actuators.

64. Apparatus as claimed in claim 63 wherein the mixed reality presentation means comprises at least one of the
features recited in claims 1 to 62.